lucene-java-user mailing list archives

From Robert Watkins <rwatk...@foo-bar.org>
Subject Re: html parsers and numbers of terms
Date Tue, 13 Dec 2005 16:43:18 GMT
So obvious I missed it (at least that's my excuse). I'm on the road at
the moment and -- can you believe it? -- didn't bring my copy of Lucene
in Action with me! Looks like I'll have to get the source code from
lucenebook.com to crib the analyzer demo code.

Much obliged,
-- Robert

On Tue, 13 Dec 2005, Erik Hatcher wrote:

> How about taking a single simple HTML file, running it through each parser, 
> dumping the tokens into separate collections (or output to a single text 
> file) and diff them?
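
A minimal sketch of that suggestion, under loud assumptions: the two
extractors below are hypothetical regex stand-ins for whichever HTML parsers
are being compared, and the tokenizer is a crude lowercase/split rather than a
Lucene analyzer. No Lucene API is used here. The point is only the shape of
the comparison: same input, one token collection per parser, then a diff of
the collections.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class TokenDiff {

    // Hypothetical stand-in for "parser A": strips tags, keeps all text.
    static String extractA(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Hypothetical stand-in for "parser B": drops the <title> element too.
    static String extractB(String html) {
        return html.replaceAll("(?is)<title>.*?</title>", " ")
                   .replaceAll("<[^>]*>", " ");
    }

    // Crude tokenizer (an assumption, not a Lucene analyzer):
    // lowercase, then split on runs of non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Distinct tokens present in 'left' but absent from 'right',
    // in first-seen order.
    static Set<String> onlyIn(List<String> left, List<String> right) {
        Set<String> diff = new LinkedHashSet<>(left);
        diff.removeAll(right);
        return diff;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Lucene Notes</title></head>"
                    + "<body><p>Indexing HTML with Lucene</p></body></html>";
        List<String> a = tokenize(extractA(html));
        List<String> b = tokenize(extractB(html));
        System.out.println("A: " + a.size() + " tokens " + a);
        System.out.println("B: " + b.size() + " tokens " + b);
        System.out.println("only in A: " + onlyIn(a, b));
        System.out.println("only in B: " + onlyIn(b, a));
    }
}
```

With real parsers, each extractor would be replaced by a call into the parser
under test, and the token lists could instead be written one token per line to
separate files and compared with an ordinary diff tool.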
