Markup transfer – nightmare or a piece of cake?

Creating markup and developing tests is time-consuming. And just when the work seems done, a new version of the spec is released. What happens next? Of course, a new version of the test suite is needed. New tests must be written, and old ones updated or even deleted.

The best way to start is to do the markup. This task can be divided into two subtasks:

  • transfer the old markup from the previous spec to the new spec (many tests are already written and linked to spec IDs, so reusing as many of them as possible is a good idea);
  • mark up new and updated assertions.

Transferring the markup is simple enough to do by hand:

  1. Find the markup tag in the old spec.
  2. Find the best place to insert the tag in the new version of the spec.
  3. Insert the tag.

If there are only 10 assertions, this work is a piece of cake. But if there are thousands, it is a hard job that should be automated. The hardest part is finding the proper new place for each markup tag, precisely because the spec has changed. For the JLS2 to JLS3 migration, the following algorithm was used:

Each assertion is surrounded by HTML anchors. Both of them should be transferred using this algorithm.

Hint 1: if a tag has been transferred, it is very likely that the next tag in the old spec belongs after the transferred one in the new spec.

Hint 2: the algorithm should check that the second anchor is positioned after the first one and not too far from it.

  1. Look at the text before and after the tag in the old spec. Find it in the new spec. If one of them wasn’t changed, the answer is found. Usually the search text should be 1-2 sentences, at least 60 characters long. If no match or several matches are found, skip this step.
  2. Try the same as in (1), but remove all HTML tags from the text that surrounds the tag. If nothing is found, skip this step.
  3. Try the adaptive algorithm, which tries to find similar text in the new spec.
    a. Use steps (1) and (2), but decrease the length of the search text in a loop until a match is found or the length becomes too short. Practice showed that this length shouldn’t be less than 20 characters.
    b. If steps (1) and (2), or (3a), found several matches, increase the length of the search text until a single match is found in the new spec or the upper limit (e.g. 140 chars) is reached. Use the hints to pick the best match.

The adaptive algorithm can be used both ignoring HTML tags and taking advantage of them. The algorithm is valid for specs written in plain text, HTML, or XML.
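To make the search more concrete, here is a minimal Python sketch of steps (1), (3a), and (3b) together with Hint 1. It is only an illustration under several assumptions, not the code of the actual transfer tool: the names `find_all`, `candidates`, and `locate` are made up, the old spec is assumed to have its markup tags already removed with `tag_pos` being the offset where a tag stood, and the step size of 10 characters is arbitrary. Step (2), stripping HTML tags, is left out because it needs an offset map back to the original text.

```python
def find_all(haystack, needle):
    """Return the start offset of every occurrence of needle in haystack."""
    hits = []
    start = haystack.find(needle)
    while start != -1:
        hits.append(start)
        start = haystack.find(needle, start + 1)
    return hits


def candidates(old_spec, tag_pos, new_spec, length, not_before):
    """Offsets in new_spec where the text before or after the tag reappears."""
    before = old_spec[max(0, tag_pos - length):tag_pos]
    following = old_spec[tag_pos:tag_pos + length]
    found = set()
    if before:
        # The tag belongs right after the matched "before" text.
        found |= {hit + len(before) for hit in find_all(new_spec, before)}
    if following:
        # The tag belongs right in front of the matched "after" text.
        found |= set(find_all(new_spec, following))
    # Hint 1: ignore places located before the previously transferred tag.
    return sorted(pos for pos in found if pos >= not_before)


def locate(old_spec, tag_pos, new_spec, not_before=0,
           start=60, shortest=20, longest=140, step=10):
    """Guess the offset of the tag in new_spec, or return None."""
    length = start
    found = candidates(old_spec, tag_pos, new_spec, length, not_before)
    # Step 3a: nothing found, so shrink the search text.
    while not found and length - step >= shortest:
        length -= step
        found = candidates(old_spec, tag_pos, new_spec, length, not_before)
    # Step 3b: several places found, so grow the search text.
    while len(found) > 1 and length + step <= longest:
        length += step
        grown = candidates(old_spec, tag_pos, new_spec, length, not_before)
        if not grown:
            break
        found = grown
    # If still ambiguous, take the first place after the previous tag.
    return found[0] if found else None
```

To transfer a whole spec, one would call `locate` for each tag in document order and pass the previous result as `not_before`. Hint 2 (the end anchor must follow the start anchor closely) could be added as one more filter on the candidates.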

This algorithm was implemented in the JLS2->JLS3 markup transfer tool. 84% of the markup tags were transferred automatically. The rest were done manually.




    Assertions and markup

    It is very important to have a good process while writing the test suite. I will talk about the one that was used for JLS.

    As mentioned before, the final product is a number of tests. There is a relation between the tests and the specification. The assertion-driven process gives an idea of what each group of tests actually checks in the specification. Using this relation, the developer can calculate the coverage, get the list of assertions for which no tests have been written, etc.
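As an illustration of what this relation buys, here is a small Python sketch that computes assertion coverage and lists the uncovered assertions. The function name `coverage_report`, the shape of the `tests` mapping, and the IDs in the usage note are all assumptions made for the example, not the real coverage scripts.

```python
def coverage_report(assertion_ids, tests):
    """Return (percent covered, list of uncovered assertion IDs).

    assertion_ids: list of all assertion IDs marked up in the spec.
    tests: dict mapping a test name to the assertion IDs it checks.
    """
    covered = set()
    for ids in tests.values():
        covered.update(ids)
    # Ignore IDs that tests mention but the spec does not contain.
    covered &= set(assertion_ids)
    uncovered = [a for a in assertion_ids if a not in covered]
    if not assertion_ids:
        return 0.0, uncovered
    percent = 100.0 * len(covered) / len(assertion_ids)
    return percent, uncovered
```

For example, with four made-up assertions `a1` to `a4` and two tests that together check `a1`, `a2`, and `a3`, the function returns 75.0 and the list `["a4"]`.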

    An assertion is a statement from a specification that can be tested. The first step is to identify all assertions in the specification. After that the developer can write tests.

    Examples of assertions from the Java Language Specification:

    • A compile-time error occurs if the same modifier appears more than once in an interface declaration.
    • The binary name of a member type consists of the binary name of its immediately enclosing type, followed by $, followed by the simple name of the member.
    • A continue statement may occur only in a while, do, or for statement.

    There can be many statements that are non-testable or involve uncertainty. Sometimes such statements include words like “possible” or “maybe”. A sentence that contains the word “may” is not necessarily non-testable, but it usually is.

    Examples of non-testable statements:

    • We do not recommend such “mixed notation” for array declarations.
    • Situations where the class of an object is not statically known may lead to run-time type errors.
    • If, however, evaluation of an expression throws an exception, then the expression is said to complete abruptly.

    There are many discussions and disputes about assertions. Some say that examples should not be treated as assertions. Others say that each statement is an assertion and that there are two kinds of them: testable and non-testable. My personal opinion is that an assertion is certainly something testable. And in most cases examples are assertions, simply because a test can be written that checks the particular example.

    The process of identifying assertions in the specification is called markup. There are many approaches. But in any case the user must be able to tell whether a statement is an assertion and somehow distinguish one assertion from another. There could be a separate repository that maps assertions and their IDs to statements. I like the idea of integrating the markup into the specification. This approach was chosen for the language area of the Java SE test suite. The JLS was written in FrameMaker. The PDF and HTML versions were created with its export mechanisms. The HTML version was used during the creation of the test suite.

    In JLS and JLS 2, special anchors identified the beginning and the end of an assertion. Additional information included the assertion ID and a short summary of the statement. The end anchor was an image and a link to the test. The HTML view and the code view are shown in the corresponding illustrations. The assertion IDs are arr033, arr034, arr020, etc.

    [Illustration: HTML view of the marked-up spec]

    [Illustration: HTML code view of the marked-up spec]

    The general idea can be described as:

    <a name=assertionID><!-- short description as html comment -->
    assertion statement here
    <img src="pics/assert.gif"><a href="path to test">test ID which is the same as assertion ID</a>

    If separate statements in different parts of the specification are tested by one test, the first tag of each will be something like arr033_0, arr033_1, arr033_2.
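A script can read such markup back out of the HTML. The following Python sketch collects the start anchors and groups the numbered parts under their base ID. It assumes that an assertion ID is letters followed by digits, as in arr033, and that the anchor is written as `<a name=...>` with optional quotes. The name `assertion_parts` is invented for this example.

```python
import re
from collections import defaultdict

# Start anchor such as <a name=arr033> or <a name="arr033_1">.
# Assumption: an ID is letters followed by digits, plus an optional _N part.
ANCHOR_RE = re.compile(r'<a\s+name="?([A-Za-z]+\d+)(?:_(\d+))?"?\s*>')


def assertion_parts(html):
    """Map each base assertion ID to the part numbers found in the HTML."""
    parts = defaultdict(list)
    for match in ANCHOR_RE.finditer(html):
        base, index = match.group(1), match.group(2)
        # An anchor without a suffix counts as a single part numbered 0.
        parts[base].append(int(index) if index is not None else 0)
    return dict(parts)
```

The keys of the result are the distinct assertion IDs, which is the list a coverage script needs.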

    This kind of architecture was used for JLS and JLS 2. It was slightly modified for JLS3, but the main idea was kept. I know some examples of approaches with non-static assertion IDs kept in a separate repository, where the ID is a hash value calculated from the content. For several reasons it turned out not to be a very good solution. Migrating to a new version of the specification is always hard. But in my opinion it is much easier with static IDs embedded into the specification.




      99% – is it enough or not?

      Today is a great day. I will try to explain why. As I mentioned in my intro post, our team creates several different TCKs. The area I work on is the so-called LANG: I develop tests for the Java language. More than two years ago we started to work on the JLS 3 specification. We had to solve many problems that often occur when a spec changes (I promise to write more about that). Our team is finishing JCK 6a, and the lang tests are part of this JCK. Today I ran the coverage scripts, and we can finally say that we have 99% assertion coverage for JLS 3. To be more precise, we have 99.4%. It means that we wrote tests for 99% of the sentences in JLS 3 that we had marked as potentially testable. Isn’t it cool? I bet it is!

      The work is certainly not over, and it never will be. There are many reasons why more tests are needed:

      • depth coverage improvement – more tests for several assertions are needed;
      • there are sentences that are testable, but for several reasons we didn’t mark them as potentially testable;
      • there will be JLS 4 soon, we should start working on it as soon as possible.

      Different people might give opposite answers to the question in the title. Most would say "Yes, of course". Indeed, 99% is almost 100%, and 100% is perfection. 99% looks great, and it is great. But we must understand what this number stands for and what can be improved. My opinion is "yes, it is great, colossal, tremendous; but no, it’s not enough, I want more, even more than 100%", which is why I plan to create a script for depth coverage calculation.
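What such a script might look like is sketched below in Python. It assumes one possible definition of depth: the number of tests that check a given assertion. The definition, the names `depth_coverage` and `shallow`, and the threshold of two tests are assumptions made for illustration, not a description of an existing script.

```python
from collections import Counter


def depth_coverage(assertion_ids, tests):
    """Return a dict: assertion ID -> number of tests that check it.

    tests: dict mapping a test name to the assertion IDs it checks.
    """
    counts = Counter()
    for ids in tests.values():
        # set() so that one test counts once per assertion.
        counts.update(set(ids))
    return {a: counts[a] for a in assertion_ids}


def shallow(depths, minimum=2):
    """List the assertions checked by fewer than `minimum` tests."""
    return sorted(a for a, n in depths.items() if n < minimum)
```

With this definition, an assertion can count as covered in the plain percentage and still show up in the `shallow` list because only a single test checks it.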

      Thanks to all the Sun developers who did the JCK-Lang work, thanks to the people who helped (especially the compiler team), and certainly great thanks to all developers who use Java :-)

      The Java world has become even more compatible and safer!


