Specifications matter; that much is clear to everyone. A widely used product, technology or language without a specification is useless. A specification without a test suite is dangerous. A test suite without markup and tests is impossible. The process as a whole is quite complex. However, there are ways to simplify the markup stage.
The Java Language Specification (JLS) and the Java Virtual Machine Specification (JVMS) are written in FrameMaker. The spec is then exported to HTML and PDF, and the markup is embedded into the HTML version. In my opinion, the markup information should be placed into (or connected with) the original text, which in our case is the FrameMaker document. I’m not sure that this is possible at all, but my guess is that it is. If not, maybe FrameMaker is not the best solution. Keeping the markup with the source would significantly reduce the time and effort needed to transfer old markup and to mark up new text. Moreover, while writing the next revision of the spec, the author together with the TCK team should mark up all changed and new assertions. I’d say the best way is for spec writing and markup to be done at the same time. It is reasonable for the author to point out to the test developers which statements should be tested.
The simplest definition of metadata is that it is data about data. Metadata can be very useful. The existing markup already embedded some metadata: an id, a short description of the assertion, and a link to the test. During the markup transfer I realized that more metadata would be very helpful. The new version of the specification contained several kinds of assertions:
old:
unchanged text, tests do not need any changes;
oldToBeChanged:
text changed, tests do need to be changed;
new:
totally new text, new tests needed;
newWritten:
new text, but tests already exist (because the test development process began as soon as the draft spec was available);
newWrittenToBeChanged:
new text, tests exist, draft spec changed, so the tests need to be changed or existing tests are not enough.
Adding this kind of data to the markup would greatly simplify the future work of test development, because just by looking at an assertion in the spec one can tell whether more tests are needed or existing ones should be updated.
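As a minimal sketch, the five kinds could be modelled as an enumeration that a test-planning script consults. The names `AssertionKind`, `REQUIRED_ACTION` and `needs_work`, as well as the action labels, are illustrative assumptions and not part of the actual markup tooling; only the kind strings come from the list above.

```python
from enum import Enum


class AssertionKind(Enum):
    """The five assertion kinds; values are the strings stored in the markup."""
    OLD = "old"
    OLD_TO_BE_CHANGED = "oldToBeChanged"
    NEW = "new"
    NEW_WRITTEN = "newWritten"
    NEW_WRITTEN_TO_BE_CHANGED = "newWrittenToBeChanged"


# Hypothetical action labels, derived from the descriptions of each kind.
REQUIRED_ACTION = {
    AssertionKind.OLD: "none",
    AssertionKind.OLD_TO_BE_CHANGED: "update tests",
    AssertionKind.NEW: "write tests",
    AssertionKind.NEW_WRITTEN: "none",
    AssertionKind.NEW_WRITTEN_TO_BE_CHANGED: "update or add tests",
}


def needs_work(kind_name: str) -> bool:
    """Return True if an assertion of this kind still needs test development."""
    return REQUIRED_ACTION[AssertionKind(kind_name)] != "none"
```

With such a table, listing the assertions that still need attention becomes a simple filter over the kinds found in the spec.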
Given the existing markup architecture, it was decided to use the title attribute of the a href tag (the second anchor). The markup then looks like this:
<a name=assertionID><!-- short description as html comment -->
assertion statement here
<img src="pics/assert.gif"><a href="path to test" title=assertType>test ID which is the same as assertion ID</a>
A browser shows the title attribute as a tooltip.
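To illustrate how a tool might read this markup back, here is a small sketch built on Python's standard `html.parser` module. The class name `AssertionExtractor`, the dictionary keys, and the sample id and test path are made up for the example; the real tooling may work quite differently.

```python
from html.parser import HTMLParser


class AssertionExtractor(HTMLParser):
    """Collect id, description, statement text, test link and kind per assertion."""

    def __init__(self):
        super().__init__()
        self.assertions = []
        self._current = None  # assertion being read, between the two anchors

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "name" in attrs:
            # First anchor: opens the assertion.
            self._current = {"id": attrs["name"], "description": "", "text": ""}
        elif "href" in attrs and self._current is not None:
            # Second anchor: carries the test link and the kind in `title`.
            self._current["test"] = attrs["href"]
            self._current["kind"] = attrs.get("title")
            self._current["text"] = self._current["text"].strip()
            self.assertions.append(self._current)
            self._current = None

    def handle_comment(self, data):
        if self._current is not None:
            self._current["description"] = data.strip()

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data


# Made-up sample in the format shown above.
sample = (
    '<a name=jls-1.2-a><!-- short description -->\n'
    'An assertion statement.\n'
    '<img src="pics/assert.gif">'
    '<a href="tests/jls-1.2-a.html" title=newWritten>jls-1.2-a</a>'
)
parser = AssertionExtractor()
parser.feed(sample)
parser.close()
```

After the run, `parser.assertions` holds one dictionary per assertion, so the kind stored in the title attribute is available to scripts as well as to a human reader.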
The whole process of creating the markup and developing tests is time-consuming. And just when the work seems done, a new version of the spec is released. What happens next? Of course, a new version of the test suite is needed. New tests must be written, and old ones updated or even deleted.
The best way to start is to do the markup. This task can be divided into two subtasks:
transfer the old markup from the previous spec to the new spec (this is needed because many tests have already been written and are linked to spec ids, and reusing as many tests as possible is a good idea);
mark up new and updated assertions.
Transferring the markup is simple enough to do by hand:
Find the markup tag in the old spec.
Find the best place to insert the tag in the new version of the spec.
Insert the tag.
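When scripted, the first and last of these steps are small string operations. The sketch below assumes that first anchors look like `<a name=ID>`, as in the markup example; the function names and the `width` parameter are hypothetical. Finding the right position in the new spec is the hard part and is not covered here.

```python
import re

# Assumption: first anchors have the form <a name=ID>.
ANCHOR = re.compile(r"<a\s+name=[^>]*>", re.IGNORECASE)


def anchors_with_context(old_spec, width=60):
    """Yield (tag, before, after) for each anchor found in the old spec."""
    for m in ANCHOR.finditer(old_spec):
        before = old_spec[max(0, m.start() - width):m.start()]
        after = old_spec[m.end():m.end() + width]
        yield m.group(0), before, after


def insert_tag(new_spec, position, tag):
    """Return the new spec with `tag` inserted at character index `position`."""
    return new_spec[:position] + tag + new_spec[position:]
```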
If there are only 10 assertions, this work is a piece of cake. But if there are thousands, it is a hard job that should be automated. The hardest part is finding the proper new place for each markup tag, precisely because the spec has changed. For the JLS2 to JLS3 migration, the following algorithm was used:
Each assertion is surrounded by two HTML anchors. Both of them should be transferred using this algorithm.
Hint 1: if a tag has been transferred, it is very likely that the next tag from the old spec belongs after it in the new spec.
Hint 2: the algorithm should check that the second anchor is positioned after the first one and not too far from it.
1. Look at the text before and after the tag in the old spec and search for it in the new spec. If one of the two was not changed, the answer is found. The search text is usually 1-2 sentences, at least 60 characters long. If no match or several matches are found, skip this step.
2. Try the same as in (1), but remove all HTML tags from the text surrounding the tag. If nothing is found, skip this step.
3. Try the adaptive algorithm, which looks for similar text in the new spec. a. Use steps (1) and (2), but decrease the length of the search text in a loop until a match is found or the length becomes too short. Practice showed that this length shouldn’t be less than 20 characters. b. If steps (1) and (2), or (3a), found several matches, increase the length of the search text until a single match is found in the new spec or the upper limit (e.g. 140 characters) is reached. Use the hints to find the best matching text.
The adaptive algorithm can be used both ignoring HTML tags and taking advantage of them. The algorithm is valid for specs written in plain text, HTML or XML.
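The search steps above can be sketched roughly as follows. This is not the code of the actual transfer tool: the function names, the `step` size of 10 and the order in which raw and tag-stripped text are tried are assumptions, and only Hint 1 is modelled (through the `hint` position). The limits 60, 20 and 140 are the ones mentioned in the text.

```python
import re
from bisect import bisect_left

TAG = re.compile(r"<[^>]*>")


def strip_tags(text):
    """Return (plain, offsets); offsets[i] is the index in `text` of plain[i]."""
    plain, offsets, pos = [], [], 0
    for m in TAG.finditer(text):
        plain.append(text[pos:m.start()])
        offsets.extend(range(pos, m.start()))
        pos = m.end()
    plain.append(text[pos:])
    offsets.extend(range(pos, len(text) + 1))  # +1: sentinel for end of text
    return "".join(plain), offsets


def find_all(haystack, needle):
    """Return the start index of every occurrence of `needle`."""
    hits, start = [], haystack.find(needle)
    while start != -1:
        hits.append(start)
        start = haystack.find(needle, start + 1)
    return hits


def search(new, before, after, length):
    """Insertion points implied by `length` chars of context, one list per side."""
    b = ([h + length for h in find_all(new, before[-length:])]
         if len(before) >= length else [])
    a = find_all(new, after[:length]) if len(after) >= length else []
    return b, a


def locate(new, before, after, hint=0, start=60, low=20, high=140, step=10):
    """Adaptive search; returns an insertion index in `new`, or None."""
    length, tried, ambiguous = start, set(), []
    while low <= length <= high and length not in tried:
        tried.add(length)
        sides = search(new, before, after, length)
        for hits in sides:
            if len(hits) == 1:
                return hits[0]      # one side is unchanged: answer found
        if any(sides):
            ambiguous = sides[0] or sides[1]
            length += step          # several matches: use more context
        else:
            length -= step          # no match: use less context
    # Hint 1: the next tag most likely follows the previously transferred one.
    later = [h for h in ambiguous if h >= hint]
    return later[0] if later else None


def transfer(new, before, after, hint=0):
    """Try the raw text first, then the text with HTML tags ignored."""
    pos = locate(new, before, after, hint)
    if pos is None:
        plain, offsets = strip_tags(new)
        plain_hint = bisect_left(offsets, hint)
        pos = locate(plain, strip_tags(before)[0], strip_tags(after)[0], plain_hint)
        if pos is not None:
            pos = offsets[pos]      # map back to an index in the tagged text
    return pos


# Made-up context taken from around a tag in the "old spec".
before = "This made-up sentence appears unchanged in both revisions. "
after = "The following sentence is also identical in both of them."
new = "A paragraph added in the new revision. " + before + after
position = transfer(new, before, after)
```

Passing the position of the previously transferred tag as `hint` is how this sketch breaks ties between several candidate positions. The "not too far" check of Hint 2 is left out for brevity.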
This algorithm was implemented in the JLS2->JLS3 markup transfer tool. 84% of the markup tags were transferred automatically; the rest were done manually.