lucene-java-user mailing list archives

From Uwe Schindler <>
Subject Re: Removing Empty Shingles in Lucene 4
Date Thu, 01 Nov 2012 19:51:01 GMT
The filter is still there. In Lucene 4.0, all TokenStream implementations live in a separate
analysis module and are no longer part of Lucene core. The package names of most analysis
components changed, too. Use your IDE to find it, or search the web.


"Igal @" <> wrote:

>I'm trying to migrate to Lucene 4.
>In Lucene 3.5 I extended
>and overrode accept() to remove undesired shingles. In Lucene 4,
>org.apache.lucene.analysis.FilteringTokenFilter does not exist?
>I'm trying to achieve two things:
>1) remove shingles that have an empty item.
>2) remove shingles when the phrase contains a comma, for example:
>    for the phrase:    "delicious red apples, green pears, and oranges"
>I want the following shingles (with a shingle size of 2):
>"delicious red", "red apples", "green pears", "and oranges"
>(no "apples green" because there's a comma)
>(no "pears and" because there's a comma)
>any ideas?
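
To make the second requirement concrete, here is a minimal sketch in plain Java of the expected output: 2-word shingles that never span a comma and never contain an empty item. It deliberately does not use any Lucene API; the class name CommaAwareShingles and the method bigrams are hypothetical names chosen for illustration. In a real analysis chain the same rule would have to be applied inside a token filter (in Lucene 4.x, FilteringTokenFilter is, as far as I know, in the org.apache.lucene.analysis.util package of the analyzers-common module), and the tokenizer would need to preserve or flag the comma for the filter to see it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only illustrates the shingle
// rule described in the question above.
final class CommaAwareShingles {

    // Returns all 2-word shingles whose words sit in the same comma-delimited
    // segment. Empty words are dropped, so no shingle has an empty item.
    static List<String> bigrams(String text) {
        List<String> shingles = new ArrayList<>();
        for (String segment : text.split(",")) {
            // Collect the non-empty words of this segment.
            List<String> words = new ArrayList<>();
            for (String word : segment.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
            // Pair each word with its right-hand neighbour in the same segment.
            for (int i = 0; i + 1 < words.size(); i++) {
                shingles.add(words.get(i) + " " + words.get(i + 1));
            }
        }
        return shingles;
    }

    public static void main(String[] args) {
        // Prints [delicious red, red apples, green pears, and oranges]
        System.out.println(bigrams("delicious red apples, green pears, and oranges"));
    }
}
```

Splitting on commas before pairing words is what keeps "apples green" and "pears and" out of the result; a segment with fewer than two words simply produces no shingle.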

Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen