lucene-java-user mailing list archives

From Robert Watkins <>
Subject Re: html parsers and numbers of terms
Date Tue, 13 Dec 2005 16:43:18 GMT
So obvious I missed it (at least that's my excuse). I'm on the road at
the moment and -- can you believe it? -- didn't bring my copy of Lucene
In Action with me! Looks like I'll have to get the source code to crib the analyzer demo code.

Much obliged,
-- Robert

On Tue, 13 Dec 2005, Erik Hatcher wrote:

> How about taking a single simple HTML file, running it through each parser, 
> dumping the tokens into separate collections (or output to a single text 
> file) and diff them?
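
A minimal sketch of that comparison, in plain Java with no Lucene dependency. The two "parsers" here (stripTagsNaive and stripTagsSkipTitle) are hypothetical stand-ins invented for illustration, and the tokens method is a crude substitute for a real analyzer; in practice each would be replaced by the actual HTML parser and analyzer under test.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

public class TokenDiff {
    // Hypothetical stand-in for parser A: replaces every tag with a space.
    static String stripTagsNaive(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Hypothetical stand-in for parser B: drops the <title> contents first,
    // then replaces the remaining tags with spaces.
    static String stripTagsSkipTitle(String html) {
        return html.replaceAll("(?is)<title>.*?</title>", " ")
                   .replaceAll("<[^>]*>", " ");
    }

    // Crude substitute for an analyzer: lowercase, split on anything that
    // is not a letter or digit, and discard empty strings.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Distinct tokens that appear in a but not in b, sorted for stable output.
    static SortedSet<String> onlyIn(List<String> a, List<String> b) {
        SortedSet<String> diff = new TreeSet<>(a);
        diff.removeAll(new HashSet<>(b));
        return diff;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo page</title></head>"
                    + "<body><p>Hello Lucene</p></body></html>";
        List<String> a = tokens(stripTagsNaive(html));
        List<String> b = tokens(stripTagsSkipTitle(html));
        System.out.println("A: " + a.size() + " tokens, B: " + b.size() + " tokens");
        System.out.println("only in A: " + onlyIn(a, b));
        System.out.println("only in B: " + onlyIn(b, a));
    }
}
```

With the sample file above, the title words show up only in A's list, which is the kind of difference that would explain two parsers producing different term counts for the same document.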
