lucene-java-user mailing list archives

From: Steven A Rowe <sar...@syr.edu>
Subject: RE: N-grams with numbers and Shinglefilters
Date: Mon, 02 Mar 2009 16:23:21 GMT

Hi Raymond,

On 3/2/2009 at 10:09 AM, Raymond Balmès wrote:
> suppose I have a tri-gram, what I want to do is index the tri-gram
> "string digit1 digit2" as one indexing phrase, and not index each token
> separately.

As long as you don't want any transformation performed on the phrase or its components, you
can add your phrase as a "keyword", i.e. a non-analyzed string that will be indexed as-is.

Unless your phrase field will be the only field on this document (pretty unlikely), you'll
want to use PerFieldAnalyzerWrapper[1] over KeywordAnalyzer[2] for the phrase field, and whatever
other analyzer you like for the other document field(s).
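
To illustrate the idea, here is a minimal plain-Java model of per-field analysis. It is not Lucene code: the class PerFieldSketch, its methods, and the field names "phrase" and "body" are all made up for this sketch. It only shows the behavior described above: the keyword field yields the whole string as one term, while other fields go through a default analyzer (here assumed to lowercase and split on whitespace). In Lucene 2.4 itself, the equivalent wiring would be a PerFieldAnalyzerWrapper built around your default analyzer, with a KeywordAnalyzer registered for the phrase field via addAnalyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of per-field analysis. NOT Lucene code; all names are hypothetical.
public class PerFieldSketch {
    // Keyword-style analysis: the whole value becomes a single term, unchanged.
    public static List<String> keyword(String value) {
        List<String> terms = new ArrayList<>();
        terms.add(value);
        return terms;
    }

    // Assumed stand-in for "whatever other analyzer you like":
    // lowercase the value and split it on whitespace.
    public static List<String> whitespaceLowercase(String value) {
        String trimmed = value.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    public PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer that overrides the default for that field.
    public void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Pick the analyzer by field name, falling back to the default.
    public List<String> analyze(String field, String value) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(value);
    }

    public static void main(String[] args) {
        PerFieldSketch analyzers = new PerFieldSketch(PerFieldSketch::whitespaceLowercase);
        analyzers.addAnalyzer("phrase", PerFieldSketch::keyword);

        System.out.println(analyzers.analyze("phrase", "string 1 2")); // [string 1 2]
        System.out.println(analyzers.analyze("body", "string 1 2"));   // [string, 1, 2]
    }
}
```

Because the keyword field is indexed as one term, a query against it has to match the full string exactly, including case and spacing.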

AFAICT, you don't need ShingleFilter.

Steve

[1] PerFieldAnalyzerWrapper: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/PerFieldAnalyzerWrapper.html
[2] KeywordAnalyzer: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/KeywordAnalyzer.html



