lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: Analysis
Date Tue, 01 Nov 2005 16:38:45 GMT
I am not sure I understand your question correctly, but I think you 
want to pick your Analyzer based on what is in your content (e.g. 
language, use of special symbols), not on the format your content 
is stored in (e.g. XML).
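
To make that concrete, here is a rough sketch in plain Java of why the content matters more than the XML wrapper. It does not use Lucene at all: whitespaceTokens and normalizedTokens are hypothetical helpers that only approximate the idea behind a whitespace-only analyzer and a normalizing one, and the real WhitespaceAnalyzer and StandardAnalyzer will tokenize differently in the details. The sample string stands in for text already pulled out of an XML element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzerChoiceSketch {

    // Approximates whitespace-only analysis: split on whitespace,
    // keep case and punctuation exactly as they appear.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Crude stand-in for a normalizing analyzer (NOT Lucene's actual
    // StandardAnalyzer rules): lowercase, then split on anything that
    // is not a letter or a digit.
    static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (String t : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Text as it might look after the XML parser has handed it over.
        String elementText = "Lucene's C++ port, part-no AB-123";
        System.out.println(whitespaceTokens(elementText));
        System.out.println(normalizedTokens(elementText));
    }
}
```

With content like part numbers or symbols such as "C++", the whitespace split keeps "AB-123" and "C++" searchable as written, while the normalizing split breaks them apart. For ordinary prose the normalizing behaviour is usually what you want. Either way, the decision comes from the text inside the elements, not from the fact that it arrived as XML.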

Malcolm wrote:

> Hi,
> I'm just asking for opinions on Analyzers for indexing. For 
> example, Otis in his article uses the WhitespaceAnalyzer and the 
> Sandbox program uses the StandardAnalyzer. I am just gauging opinions 
> on the subject with regard to XML.
> I'm using a mix of the Sandbox XMLDocumentHandlerSAX and a bit extra. 
> I originally started using Digester but found that I preferred the 
> Sandbox implementation.
> Thanks,
> Malcolm Clark

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
337 Hinds Hall 
Syracuse, NY 13244 
Voice:  315-443-5484 
Fax: 315-443-6886 
