ant-user mailing list archives

From Joel Rees <j...@alpsgiken.gr.jp>
Subject Re: The <javadoc> task doesn't support the nested "patternset" element
Date Mon, 29 Jul 2002 03:23:41 GMT
More talking to myself about XML parsing, well-formedness, and error
messages:

> I wrote
> 
> > With DOM, your program doesn't get a chance to start looking at the
> > semantics if the document is well-formed, 
> 
> I should have written
> 
> > With DOM, your program doesn't get a chance to start looking at the
> > semantics if the document is _not_ well-formed, 
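
As one data point only (not a statement about what the DOM spec requires), here is a minimal sketch using Python's standard-library xml.dom.minidom, which builds its tree with the non-validating expat parser. The enclosing <javadoc> element, the dir value, and the load() helper are assumptions added for illustration. With this builder the caller gets either a tree or an exception, never a partial tree to inspect:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Assumed wrapper: the fragment from the thread placed inside a <javadoc> root.
# The first <fileset> is closed by "/>", so the later </fileset> is unmatched.
BROKEN = b"""<javadoc>
  <fileset dir="src" />
  <patternset refid="patternset.src" />
  </fileset>
</javadoc>
"""


def load(data):
    """Hypothetical helper: return (tree, error); exactly one is None."""
    try:
        return minidom.parseString(data), None
    except ExpatError as exc:
        # minidom raises before handing back any tree at all.
        return None, exc


if __name__ == "__main__":
    tree, error = load(BROKEN)
    print("tree:", tree)
    print("error:", error)
```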

I was thinking about this on the train home Friday night, and it
occurred to me that DOM implementations might also vary as to whether
they check for well-formedness before validating. One programmer may
think it's a great simplification to parse for well-formedness (probably
recursive-descent) while reading a document into memory. Since the
second parse against the DTD would (one might assume) be performed on
the memory-resident tree structure, it would not seem to really waste
that much time.

On the other hand, since a general XML parser, DOM or SAX, is not likely
to be validating with a recursive-descent parse, it may not be
unreasonable to run both concurrently. I haven't checked whether the DOM
spec requires well-formedness to be known before beginning validation.
(Probably won't check.)

The point being that 

> > <fileset dir="${dir.src}" />
> > <patternset refid="patternset.src" />
> > </fileset>

is not known to be _not_ _well-formed_ until the extra </fileset> is
read and parsed. Some parsers _may_ run the well-formedness check before
they validate (either against a DTD or by using program logic), but I
think that would be an implementation feature.

Since Ant uses SAX, it would simply not be reasonable to ask for the
lack of well-formedness to be reported before the artifact error of the
<patternset/> being a sibling node of the <fileset/>. That means that
there are two ways to point to the termination of the <fileset/> being
in the wrong place. One is to disallow empty <fileset/>s, which would be
a change to the (virtual) DTD and conflict with the design of Ant. The
other is to report the well-formedness error _after_ the improper
nesting error.
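
The ordering described above can be sketched with Python's standard-library xml.sax (expat-backed) instead of Ant's Java SAX setup. Everything named here is an assumption for illustration: the ALLOWED table is a toy stand-in for Ant's real element checks, and the <javadoc> root is added so the fragment forms a document. The handler sees the misplaced <patternset> as soon as its start tag is read, so that complaint is logged before the parser ever reaches the stray </fileset>:

```python
import xml.sax

# Toy nesting rules, NOT Ant's actual ones.
ALLOWED = {"javadoc": {"fileset"}, "fileset": {"patternset"}}

# Assumed wrapper around the fragment from the thread.
BUILD = b"""<javadoc>
  <fileset dir="src" />
  <patternset refid="patternset.src" />
  </fileset>
</javadoc>
"""


class NestingChecker(xml.sax.ContentHandler):
    """Logs a complaint the moment an element opens in the wrong parent."""

    def __init__(self, log):
        super().__init__()
        self.log = log
        self.stack = []  # names of currently open elements

    def startElement(self, name, attrs):
        if self.stack:
            parent = self.stack[-1]
            if name not in ALLOWED.get(parent, set()):
                self.log.append(
                    "nesting: <%s> doesn't support nested <%s>" % (parent, name)
                )
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()


def check(data):
    """Return log entries in the order the parser produced them."""
    log = []
    try:
        xml.sax.parseString(data, NestingChecker(log))
    except xml.sax.SAXParseException as exc:
        log.append("not well-formed: line %d" % exc.getLineNumber())
    return log


if __name__ == "__main__":
    for entry in check(BUILD):
        print(entry)
```

The nesting entry comes first in the log and the well-formedness entry second, which is the second of the two options above.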

This is part of the nature of XML (and parsed declarations in general).
The only way to avoid it would be to abandon hand-written XML. Hmm.
Let's see. <tongue-in-cheek>The make syntax is pretty popular. Let's
define a new make syntax for Ant, and write a parser that converts that
to XML, so humans never have to see it.</tongue-in-cheek><wink/>

That said, it would be possible to hold off the error message about
elements being in the wrong place until after it is known that all their
sibling nodes are well-formed. But generalizing that brings us right back
to the issue of when to parse for well-formedness.
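
A rough sketch of the holding-off idea, again in Python's xml.sax and not anything Ant does: nesting complaints are queued instead of reported, and a well-formedness error, if one turns up, is put ahead of them. Note that this version waits for the whole document and not just the siblings, which is exactly the generalization mentioned above. ALLOWED, BROKEN, DeferredChecker and report() are all hypothetical names:

```python
import xml.sax

# Toy nesting rules, NOT Ant's actual ones.
ALLOWED = {"javadoc": {"fileset"}, "fileset": {"patternset"}}

# Assumed wrapper around the fragment from the thread.
BROKEN = b"""<javadoc>
  <fileset dir="src" />
  <patternset refid="patternset.src" />
  </fileset>
</javadoc>
"""


class DeferredChecker(xml.sax.ContentHandler):
    """Queues nesting complaints instead of reporting them immediately."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.pending = []  # complaints held back until parsing stops

    def startElement(self, name, attrs):
        if self.stack:
            parent = self.stack[-1]
            if name not in ALLOWED.get(parent, set()):
                self.pending.append(
                    "<%s> doesn't support nested <%s>" % (parent, name)
                )
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()


def report(data):
    """Return messages with any well-formedness error listed first."""
    handler = DeferredChecker()
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException as exc:
        first = "not well-formed at line %d: %s" % (
            exc.getLineNumber(), exc.getMessage())
        return [first] + handler.pending
    return handler.pending


if __name__ == "__main__":
    for message in report(BROKEN):
        print(message)
```

The cost is that nothing is reported until the parser has either finished or failed, so the streaming advantage of SAX is given up for the sake of message order.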

There. I've had my fix of pedantry for the day.

-- 
Joel Rees <joel@alpsgiken.gr.jp>

