lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 27423] New: - Demo HTML parser does not properly handle meta tag attributes.
Date Thu, 04 Mar 2004 00:05:06 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27423>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27423

Demo HTML parser does not properly handle meta tag attributes.

           Summary: Demo HTML parser does not properly handle meta tag
                    attributes.
           Product: Lucene
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Examples
        AssignedTo: lucene-dev@jakarta.apache.org
        ReportedBy: matt@sidefx.com


Version 1.3final.

The meta tag parsing in the demo HTML parser
(demo/org/apache/lucene/demo/html/HTMLParser.jj) incorrectly relies on the meta
tag's "name" attribute coming before its "content" attribute. In both XML and
HTML, the order of attributes within a tag is not significant.

So, if I have tags:

<meta content="blah" name="blarg" />
<meta content="gluh" name="glarg" />

...the parser will not parse them correctly. (In fact, it will simply fill in
name/content pairs as it encounters attributes in the stream, without regard to
which meta tags the attributes are actually in. So, in the above example, I will
get one meta property of "blarg"="gluh".)
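
To make the symptom concrete, here is a minimal sketch in plain Java. It is not
the code from HTMLParser.jj: the class name MetaAttributeOrderDemo, both method
names, and the regular expressions are hypothetical and exist only for this
illustration. orderDependent() is a simplified model, assumed from the behaviour
described above, that pairs a pending "name" with the next "content" it sees
anywhere in the stream. orderIndependent() shows one possible approach: collect
all attributes of a single meta tag first, then look up "name" and "content".

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaAttributeOrderDemo {
    // Hypothetical, deliberately simple patterns; only double-quoted values are handled.
    private static final Pattern META =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern ATTR =
        Pattern.compile("([A-Za-z-]+)\\s*=\\s*\"([^\"]*)\"");

    // Simplified model of the reported behaviour (an assumption, not the real grammar):
    // a "name" is remembered until the next "content" arrives, ignoring tag boundaries.
    public static Map<String, String> orderDependent(String html) {
        Map<String, String> props = new LinkedHashMap<>();
        String pendingName = null; // state that survives from one tag to the next
        Matcher tag = META.matcher(html);
        while (tag.find()) {
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                if (key.equals("name")) {
                    pendingName = attr.group(2);
                } else if (key.equals("content") && pendingName != null) {
                    props.put(pendingName, attr.group(2));
                    pendingName = null;
                }
            }
        }
        return props;
    }

    // Order-independent variant: gather every attribute of one tag, then pair them up.
    public static Map<String, String> orderIndependent(String html) {
        Map<String, String> props = new LinkedHashMap<>();
        Matcher tag = META.matcher(html);
        while (tag.find()) {
            Map<String, String> attrs = new HashMap<>(); // fresh map for each tag
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(Locale.ROOT), attr.group(2));
            }
            String name = attrs.get("name");
            String content = attrs.get("content");
            if (name != null && content != null) {
                props.put(name, content);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        String html = "<meta content=\"blah\" name=\"blarg\" />\n"
                    + "<meta content=\"gluh\" name=\"glarg\" />";
        System.out.println(orderDependent(html));   // prints {blarg=gluh}
        System.out.println(orderIndependent(html)); // prints {blarg=blah, glarg=gluh}
    }
}
```

In the actual grammar, the equivalent change would presumably be to defer
recording the meta property until the end of each tag, but that is a guess
about the fix, not a description of the existing code.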

This is a problem because my XSLT happens to result in meta tags with attributes
in the above order.

It may not seem like a big deal since it's in demo code, but because
HTMLParser.jj is many times faster than more heavy-weight solutions, I'd love
for this to be fixed, if possible.


