lucene-java-user mailing list archives

From Uwe Schindler <...@thetaphi.de>
Subject Re: Removing Empty Shingles in Lucene 4
Date Thu, 01 Nov 2012 19:51:01 GMT
The filter is still there. In Lucene 4.0, all TokenStream implementations live in a separate
module and are no longer part of Lucene core. The package names of most analysis components
changed, too. Use your IDE to find it or ask Google...

Uwe



"Igal @ getRailo.org" <igal@getrailo.org> wrote:

>hi,
>
>I'm trying to migrate to Lucene 4.
>
>In Lucene 3.5 I extended
>org.apache.lucene.analysis.FilteringTokenFilter
>and overrode accept() to remove undesired shingles. In Lucene 4,
>org.apache.lucene.analysis.FilteringTokenFilter does not exist?
>
>I'm trying to achieve two things:
>
>1) remove shingles that have an empty item.
>
>2) remove shingles when the phrase contains a comma, for example:
>
>    for the phrase:    "delicious red apples, green pears, and oranges"
>
>I want the following shingles (with a shingle size of 2):
>
>"delicious red", "red apples", "green pears", "and oranges"
>(no "apples green" because there's a comma)
>(no "pears and" because there's a comma)
>
>any ideas?
>
>TIA
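
The two requirements in the question above can be sketched in plain Java, without any Lucene classes, to pin down the expected output. This is only an illustration under stated assumptions: the class name ShingleSketch and the method shingles are hypothetical, an "empty item" is taken to mean an empty word, and a comma is treated as a hard boundary that no shingle may cross. In Lucene 4.x the class asked about is org.apache.lucene.analysis.util.FilteringTokenFilter in the analyzers-common module, and an equivalent check would presumably go into an accept() override there.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; does not use any Lucene classes.
class ShingleSketch {

    // Returns word shingles of exactly `size` words that never span a comma
    // and never contain an empty word.
    static List<String> shingles(String phrase, int size) {
        List<String> result = new ArrayList<>();
        // A comma ends a segment, so no shingle can cross it.
        for (String segment : phrase.split(",")) {
            List<String> words = new ArrayList<>();
            for (String word : segment.trim().split("\\s+")) {
                if (!word.isEmpty()) { // drop empty items
                    words.add(word);
                }
            }
            // Slide a window of `size` words over the segment.
            for (int i = 0; i + size <= words.size(); i++) {
                result.add(String.join(" ", words.subList(i, i + size)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example phrase taken from the question, shingle size 2.
        System.out.println(
            shingles("delicious red apples, green pears, and oranges", 2));
    }
}
```

For the phrase in the question with a shingle size of 2, this yields "delicious red", "red apples", "green pears" and "and oranges", with no "apples green" and no "pears and".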

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de