lucene-java-user mailing list archives

From "Igal @ getRailo.org" <i...@getrailo.org>
Subject Re: Removing Empty Shingles in Lucene 4
Date Thu, 01 Nov 2012 20:00:43 GMT
Thank you. I found it at:

     org.apache.lucene.analysis.util.FilteringTokenFilter


Igal


On 11/1/2012 12:51 PM, Uwe Schindler wrote:
> The filter is still there. In Lucene 4.0 all tokenstream implementations are in a separate
> module, no longer in Lucene core. The package names of most analysis components changed, too.
> Use your IDE to find it or ask Google...
>
> Uwe
>
>
>
> "Igal @ getRailo.org" <igal@getrailo.org> wrote:
>
>> hi,
>>
>> I'm trying to migrate to Lucene 4.
>>
>> In Lucene 3.5 I extended
>> org.apache.lucene.analysis.FilteringTokenFilter
>> and overrode accept() to remove undesired shingles. In Lucene 4,
>> does org.apache.lucene.analysis.FilteringTokenFilter no longer exist?
>>
>> I'm trying to achieve two things:
>>
>> 1) remove shingles that have an empty item.
>>
>> 2) remove shingles when the phrase contains a comma, for example:
>>
>>     for the phrase:    "delicious red apples, green pears, and oranges"
>>
>> I want the following shingles (with a shingle size of 2):
>>
>> "delicious red", "red apples", "green pears", "and oranges"
>> (no "apples green" because there's a comma)
>> (no "pears and" because there's a comma)
>>
>> any ideas?
>>
>> TIA
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de
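
The two rules in the quoted question (drop shingles with an empty item, drop shingles that span a comma) can be sketched in plain Java without any Lucene dependency. This is only an illustration of the kind of check an accept() override might perform, not the poster's actual filter. The names ShingleRules, accept, shingles and FILLER are hypothetical, and the sketch assumes whitespace tokenization with commas left attached to the preceding word and "_" standing in for an empty shingle position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Lucene) sketching the two shingle rules. */
final class ShingleRules {

    /** Assumption: an empty shingle position is marked with "_". */
    static final String FILLER = "_";

    /** Returns true if the shingle has no empty item and no comma between its words. */
    static boolean accept(String shingle) {
        String[] parts = shingle.split(" ", -1);
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            // Rule 1: reject shingles that contain an empty or filler item.
            if (part.isEmpty() || part.equals(FILLER)) {
                return false;
            }
            // Rule 2: a comma after any word but the last means the shingle
            // crosses a comma boundary.
            boolean last = (i == parts.length - 1);
            if (!last && part.endsWith(",")) {
                return false;
            }
        }
        return true;
    }

    /** Builds word shingles of the given size and keeps only the accepted ones. */
    static List<String> shingles(String text, int size) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= words.length; i++) {
            String shingle = String.join(" ", Arrays.copyOfRange(words, i, i + size));
            if (accept(shingle)) {
                // A trailing comma belongs to the boundary, not to the shingle text.
                out.add(shingle.endsWith(",")
                        ? shingle.substring(0, shingle.length() - 1)
                        : shingle);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(
            shingles("delicious red apples, green pears, and oranges", 2));
    }
}
```

In a real subclass of org.apache.lucene.analysis.util.FilteringTokenFilter, the same predicate would presumably be applied to the current term text inside accept(). That also assumes the tokenizer in front of the shingle filter preserves commas, which many tokenizers do not.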



