lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/analysis/standard StandardTokenizer.java Standar...
Date Wed, 24 Dec 2003 22:37:55 GMT
Dan - I moderated your message into the list since you were not subscribed 
with the address you sent from.

More below...

On Dec 24, 2003, at 2:14 PM, danrapp@comcast.net wrote:
> I haven't had a chance to look into this very much, but I'm guessing 
> it's due to the fact that the ideogram sequence isn't tokenized when 
> added to the index (since it is added as a keyword), but it is 
> tokenized when parsing the query...
>
> Any suggestions for working around this? I'd prefer to keep the 
> document.add(Field.Keyword(...)) pattern.
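
(A minimal sketch of the mismatch being described above, assuming roughly 
the Lucene 1.x API of the time; the "category" field name, the index path, 
and the class name are made up for illustration, and exact signatures may 
have shifted between releases:)

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;

    public class KeywordMismatchDemo {
        public static void main(String[] args) throws Exception {
            // Index side: Field.Keyword stores the whole value as a single,
            // untokenized term in the "category" field.
            IndexWriter writer =
                new IndexWriter("/tmp/demo-index", new StandardAnalyzer(), true);
            Document doc = new Document();
            doc.add(Field.Keyword("category", "\u4e2d\u6587\u8a5e"));
            writer.addDocument(doc);
            writer.close();

            // Query side: QueryParser runs the same text through
            // StandardAnalyzer, which splits the ideogram sequence into
            // separate tokens, so the query no longer matches the single
            // indexed term.
            Query q = QueryParser.parse("\u4e2d\u6587\u8a5e", "category",
                                        new StandardAnalyzer());
            System.out.println(q);
        }
    }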

It has always been the case that the query text gets tokenized during 
parsing.  There are some tricks you can play, though - like using the 
PerFieldAnalyzerWrapper to assign a different analyzer to that 
particular field, something like the WhitespaceAnalyzer or a 
custom one that "tokenizes" the entire field as one chunk.
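
(A minimal sketch of that workaround, again assuming the Lucene 1.x-era 
API; the SingleTokenAnalyzer class and the "category" and "contents" 
field names are made up for illustration, and constructor/method 
signatures may differ slightly by release:)

    import java.io.IOException;
    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;

    // Custom analyzer that "tokenizes" the entire field value as one chunk,
    // so QueryParser leaves keyword-style terms intact.
    public class SingleTokenAnalyzer extends Analyzer {
        public TokenStream tokenStream(String fieldName, final Reader reader) {
            return new TokenStream() {
                private boolean done = false;

                public Token next() throws IOException {
                    if (done) return null;
                    done = true;
                    // Read the whole field value and emit it as a single token.
                    StringBuffer text = new StringBuffer();
                    char[] buffer = new char[256];
                    int length;
                    while ((length = reader.read(buffer)) != -1) {
                        text.append(buffer, 0, length);
                    }
                    return new Token(text.toString(), 0, text.length());
                }

                public void close() throws IOException {
                    reader.close();
                }
            };
        }

        public static void main(String[] args) throws Exception {
            // StandardAnalyzer everywhere except the untokenized keyword field.
            PerFieldAnalyzerWrapper analyzer =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
            analyzer.addAnalyzer("category", new SingleTokenAnalyzer());

            // The keyword field's term now passes through query parsing unsplit.
            Query q = QueryParser.parse("category:\u4e2d\u6587\u8a5e",
                                        "contents", analyzer);
            System.out.println(q);
        }
    }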

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

