Date: 4 Mar 2004 00:05:06 -0000
Message-ID: <20040304000506.24435.qmail@nagoya.betaversion.org>
From: bugzilla@apache.org
To: lucene-dev@jakarta.apache.org
Subject: DO NOT REPLY [Bug 27423] New: - Demo HTML parser does not properly handle meta tag attributes.

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27423

    Summary:    Demo HTML parser does not properly handle meta tag attributes.
    Product:    Lucene
    Version:    unspecified
    Platform:   All
    OS/Version: All
    Status:     NEW
    Severity:   Normal
    Priority:   Other
    Component:  Examples
    AssignedTo: lucene-dev@jakarta.apache.org
    ReportedBy: matt@sidefx.com

Version 1.3final.

The meta tag parsing in the demo HTML parser (demo/org/apache/lucene/demo/html/HTMLParser.jj) incorrectly relies on the meta tag's "name" attribute coming before its "content" attribute. In XML/HTML, attribute order is supposed to be insignificant. So, if I have tags:

...the parser will not parse them correctly.

(In fact, it will simply fill in name/content pairs as it encounters attributes in the stream, without regard to which meta tags the attributes are actually in. So, in the above example, I will get one meta property of "blarg"="gluh".)

This is a problem because my XSLT happens to result in meta tags with attributes in the above order.

It may not seem like a big deal since it's in demo code, but because HTMLParser.jj is many times faster than more heavy-weight solutions, I'd love for this to be fixed, if possible.
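
For illustration only, here is a minimal sketch of the order-independent behavior this report asks for. It is not the JavaCC grammar in HTMLParser.jj and not a proposed patch: it uses java.util.regex on a plain string, and the class name MetaTagSketch, the method extractMetaTags, and the sample tags are all hypothetical. The idea is to collect the attributes of one meta tag at a time, and to pair "name" with "content" only once the whole tag has been read, so that attribute order and neighboring tags cannot affect the result.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not part of the Lucene demo code.
final class MetaTagSketch {
    // Matches one whole <meta ...> tag and captures its attribute text.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Matches one quoted attribute: key="value" or key='value'.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([A-Za-z][\\w-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Returns name -> content for every meta tag that carries both attributes.
    static Map<String, String> extractMetaTags(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            // State is reset for each tag, so attributes never leak between tags.
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group(1));
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(2) != null ? attr.group(2) : attr.group(3);
                if (key.equals("name")) {
                    name = value;
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            // Pair the attributes only after the whole tag has been scanned.
            if (name != null && content != null) {
                result.put(name, content);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input made up for this sketch: one tag in each attribute order.
        String html = "<meta content=\"a\" name=\"first\"/>"
                    + "<meta name=\"second\" content=\"b\"/>";
        System.out.println(extractMetaTags(html)); // prints {first=a, second=b}
    }
}
```

A fix in the grammar itself would presumably follow the same shape, resetting the pending name and content at the start of each meta tag and recording the pair at the end of the tag, but that is an assumption about HTMLParser.jj rather than something verified against its source.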