lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Philosophy(??) question
Date Wed, 14 Jan 2004 08:49:52 GMT
On Jan 13, 2004, at 5:23 PM, Scott Smith wrote:
> Some day, I'd be interested to understand the "deeper question".

Here is a scenario to ponder, one where it makes sense to use different
analyzers at index and query time.

Suppose you have a custom analyzer that places synonyms of words into 
the same token position as the original words.  Neither QueryParser nor 
PhraseQuery currently deals with token position.  If you use that same 
analyzer with QueryParser, the query will be mangled, because the 
injected synonyms are treated as additional consecutive terms.  A 
query-time analyzer that does everything the indexing analyzer does, 
except put the synonyms into the token stream, will do the trick.  
There is no need to look up synonyms at query time; they are already 
indexed.
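
To make the scenario concrete, here is a toy model in plain Java.  It 
is only a sketch and does not use Lucene's API: the class name, the 
synonym table, and all method names are made up for illustration, and 
the "mangling" is modelled on the assumption that a position-unaware 
parser simply lines up every token it receives as consecutive phrase 
terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only; none of these names come from Lucene.
class SynonymPositionDemo {
    // Hypothetical synonym table used by the index-time analyzer.
    static final Map<String, List<String>> SYNONYMS =
        Map.of("quick", List.of("fast", "speedy"));

    // Index-time analysis: lowercase, split on whitespace, and put each
    // synonym at the SAME position as the original word.
    static List<Set<String>> analyzeForIndex(String text) {
        List<Set<String>> positions = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) continue;
            Set<String> terms = new LinkedHashSet<>();
            terms.add(word);
            terms.addAll(SYNONYMS.getOrDefault(word, List.of()));
            positions.add(terms);
        }
        return positions;
    }

    // Query-time analysis: same normalization, but no synonym injection.
    static List<String> analyzeForQuery(String text) {
        List<String> terms = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) terms.add(word);
        }
        return terms;
    }

    // What a position-unaware parser would see if handed the index-time
    // analyzer: every token, synonyms included, as consecutive terms.
    static List<String> flattenIgnoringPositions(List<Set<String>> positions) {
        List<String> flat = new ArrayList<>();
        for (Set<String> termsAtPosition : positions) flat.addAll(termsAtPosition);
        return flat;
    }

    // Exact phrase match: phrase term i must occur at position start + i.
    static boolean phraseMatches(List<Set<String>> doc, List<String> phrase) {
        if (phrase.isEmpty()) return false;
        for (int start = 0; start + phrase.size() <= doc.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < phrase.size(); i++) {
                if (!doc.get(start + i).contains(phrase.get(i))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> doc = analyzeForIndex("The quick brown fox");

        // Query analyzed WITHOUT synonyms: matches the original wording...
        System.out.println(phraseMatches(doc, analyzeForQuery("quick brown"))); // true
        // ...and a synonym too, because "fast" was indexed at the same position.
        System.out.println(phraseMatches(doc, analyzeForQuery("fast brown")));  // true

        // Query analyzed WITH the index-time analyzer and flattened:
        // the phrase becomes [quick, fast, speedy, brown] and no longer matches.
        List<String> mangled = flattenIgnoringPositions(analyzeForIndex("quick brown"));
        System.out.println(phraseMatches(doc, mangled));                        // false
    }
}
```

The two analyzers "cooperate" here in that they share the same 
normalization (lowercasing and splitting); only the synonym step is 
left out on the query side.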

Also, if you use Field.Keyword at indexing time, the value is indexed 
as a single term without being analyzed, so it probably makes sense 
for the analyzer used with QueryParser to leave those keyword fields 
un-"analyzed" as well.
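
As a rough sketch of that idea, the plain-Java toy below picks the 
query-time analysis per field: a default for ordinary text fields and 
a pass-through for keyword fields.  It is shaped like a per-field 
wrapper but is not Lucene code; the class, the field names 
("partNumber", "body"), and the analyzer functions are all invented 
for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model only; none of these names come from Lucene.
class PerFieldQueryAnalysisDemo {
    // Keyword-style "analysis": the whole value is one term, untouched.
    static final Function<String, List<String>> KEYWORD = text -> List.of(text);

    // Simple analysis: lowercase, then split on anything that is not a
    // letter or a digit.
    static final Function<String, List<String>> SIMPLE = text -> {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    };

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldQueryAnalysisDemo(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a special analyzer for one field; returns this for chaining.
    PerFieldQueryAnalysisDemo addAnalyzer(String field,
                                          Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
        return this;
    }

    // Use the field's own analyzer if one is registered, else the default.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        PerFieldQueryAnalysisDemo queryAnalyzer =
            new PerFieldQueryAnalysisDemo(SIMPLE).addAnalyzer("partNumber", KEYWORD);

        // Keyword field: the query text stays one exact term.
        System.out.println(queryAnalyzer.analyze("partNumber", "ABC-123 Rev")); // [ABC-123 Rev]
        // Ordinary field: the query text is lowercased and split.
        System.out.println(queryAnalyzer.analyze("body", "ABC-123 Rev"));       // [abc, 123, rev]
    }
}
```

If the keyword field went through the default analysis instead, the 
query would look for "abc", "123" and "rev", none of which is the 
single term that was actually indexed.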

	Erik


> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: Tuesday, January 13, 2004 3:19 AM
> To: Lucene Users List
> Subject: Re: Philosophy(??) question
>
>
> On Jan 12, 2004, at 7:59 PM, Scott Smith wrote:
>> I have some documents I'm indexing which have multiple languages in
>> them (i.e., some fields in the document are always English; other
>> fields may be other languages).  Now, I understand why a query
>> against a certain field must use the same analyzer as was used when
>> that field was indexed (stemming, stop words, etc.).  It seems like
>> different fields could use different analyzers and the world would
>> still be a happy place.  However, since the analyzer() is passed in
>> as part of the IndexWriter, that can't happen.  Is there a way to do
>> this (other than having multiple indexes which is a problem trying to
>> do combined searches)?  Or am I missing something more subtle?  Sorry
>> if I'm plowing old ground.
>
> The new PerFieldAnalyzerWrapper (in v. 1.3) allows you to specify
> different analyzers, as its name says, per field.  You simply specify
> which analyzer to use as a default and then any special ones for
> individual fields.
>
> As for using the same analyzer for querying as for indexing - that is a
> deeper question that I've yet to agree with.  There are some
> interesting reasons why you may want a different one - although they
> must "cooperate" in some fashion.
>
> 	Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

