lucene-java-user mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: many analyzers, same index.
Date Fri, 19 Oct 2001 15:45:47 GMT
> From: Brandon Jockman [mailto:brandonj@isogen.com]
> 
> Is there anything wrong with using multiple analyzers on the 
> same index, (given of course that I am keeping the set of 
> documents for each mutually exclusive)?

This should work.  The primary risk is that the analyzers you index with
generate different terms than the analyzer you use to tokenize queries, in
which case a query can miss documents it ought to match.  So long as you're
aware of that, this shouldn't be a problem.
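
To illustrate the risk, here is a minimal sketch in plain Java.  It does not
use Lucene's Analyzer API: the two "analyzers" are toy functions from text to
terms, and the class name, the map-based index, and the sample documents are
all hypothetical.  It only shows how a term produced at index time can fail
to line up with the term produced at query time.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

final class AnalyzerMismatchDemo {
    // Toy stand-ins for analyzers (text -> terms); NOT Lucene's Analyzer API.
    static final Function<String, List<String>> LOWERCASING =
        text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
    static final Function<String, List<String>> CASE_PRESERVING =
        text -> Arrays.asList(text.split("\\s+"));

    // A tiny inverted index: term -> ids of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one document with the analyzer chosen for it.
    void add(int docId, String text, Function<String, List<String>> analyzer) {
        for (String term : analyzer.apply(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Tokenize the query with the given analyzer and collect matching ids.
    Set<Integer> search(String query, Function<String, List<String>> analyzer) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyzer.apply(query)) {
            hits.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        AnalyzerMismatchDemo index = new AnalyzerMismatchDemo();
        index.add(1, "Lucene Search", LOWERCASING);     // stored as: lucene, search
        index.add(2, "Lucene Index", CASE_PRESERVING);  // stored as: Lucene, Index

        // Each query analyzer only finds the documents indexed the same way.
        System.out.println(index.search("lucene", LOWERCASING));     // [1]
        System.out.println(index.search("Lucene", CASE_PRESERVING)); // [2]
    }
}
```

Both documents contain the same word, yet neither query finds both of them,
because the indexed terms differ in case.  With mutually exclusive document
sets this may be exactly what you want, provided each query is tokenized the
same way as the documents it is meant to match.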

> I am hoping to use a single Lucene index in my application, 
> but some documents need to be analyzed differently.

Can you provide more details?  A common mistake is to try to do format
conversion in analyzers, e.g., to have one analyzer for PDF files, another
for HTML files, etc.  The better approach is to implement converters that
turn these formats into plain text, either a String or a Reader.  Then you
can use the same analyzer for documents in different formats.
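
A rough sketch of the converter idea, in plain Java with no Lucene
dependency.  The TextConverter interface, the class name, and the regex-based
HTML stripping are assumptions made up for illustration; a real HTML or PDF
converter would use a proper parser.  The point is only that every format
ends up as a Reader of plain text before any analyzer sees it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

final class ConverterSketch {
    /** Hypothetical interface: turns one source format into plain text. */
    interface TextConverter {
        Reader toPlainText(InputStream in) throws IOException;
    }

    // Plain text needs no conversion beyond decoding the bytes.
    static final TextConverter PLAIN =
        in -> new InputStreamReader(in, StandardCharsets.UTF_8);

    // Deliberately naive HTML converter: drops tags with a regex and
    // collapses whitespace. Requires Java 9+ for readAllBytes().
    static final TextConverter HTML = in -> {
        String raw = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        String text = raw.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
        return new StringReader(text);
    };

    // Helper for the demo: drain a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            out.append(buffer, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] html = "<p>Hello <b>Lucene</b></p>".getBytes(StandardCharsets.UTF_8);
        byte[] plain = "Hello Lucene".getBytes(StandardCharsets.UTF_8);

        // Both formats yield the same plain text, so a single analyzer
        // can be applied to the resulting Reader in either case.
        System.out.println(readAll(HTML.toPlainText(new ByteArrayInputStream(html))));
        System.out.println(readAll(PLAIN.toPlainText(new ByteArrayInputStream(plain))));
    }
}
```

With this split, adding a new document format means writing one more
converter, while the analyzer used for indexing and for queries stays the
same.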

Doug
