lucene-java-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Analyzer for stripping non alpha-numeric characters?
Date Thu, 04 Feb 2010 23:00:18 GMT
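Answering my own question: PatternReplaceFilter doesn't output
multiple tokens. It emits exactly one token for each input token.

Which means writing a custom TokenFilter and messing with capture state
(captureState()/restoreState()) to emit the extra variants.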
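
For reference, here is a minimal sketch of the two behaviours discussed in this thread, written in plain Java rather than as a Lucene TokenFilter. The class and method names (TrailingStripSketch, stripTrailing, progressiveVariants) are hypothetical and only for illustration. It uses the same regex Steve suggested. Note that in java.util.regex, \p{Alnum} is ASCII-only by default, so non-ASCII letters at the end of a token would be stripped as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene or Solr.
public class TrailingStripSketch {
    // Same pattern as in the thread: one or more non-alphanumerics at the end.
    private static final Pattern TRAILING_NON_ALNUM = Pattern.compile("\\P{Alnum}+$");

    // One token in, one token out: what a single pattern replacement gives you.
    public static String stripTrailing(String token) {
        return TRAILING_NON_ALNUM.matcher(token).replaceFirst("");
    }

    // One token in, several out: the original token, then each shorter variant
    // obtained by dropping one trailing non-alphanumeric char at a time.
    public static List<String> progressiveVariants(String token) {
        List<String> out = new ArrayList<>();
        int stop = stripTrailing(token).length(); // length of the fully stripped form
        int end = token.length();
        out.add(token);
        while (end > stop) {
            end--;
            if (end > 0) {                        // never emit an empty token
                out.add(token.substring(0, end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String variant : progressiveVariants("apple!&*")) {
            System.out.println(variant);
        }
    }
}
```

Inside a real TokenFilter, the extra variants would presumably be queued and emitted on later incrementToken() calls, restoring the captured state of the original token each time. That part is not shown here.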

On Thu, Feb 4, 2010 at 2:16 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
> Transferred partially to solr-user...
>
> Steven, thanks for the reply!
>
> I wonder if PatternReplaceFilter can output multiple tokens?  I'd like
> to progressively strip the non-alphanums, for example output:
>
> apple!&*
> apple!&
> apple!
> apple
>
> On Thu, Feb 4, 2010 at 12:18 PM, Steven A Rowe <sarowe@syr.edu> wrote:
>> Hi Jason,
>>
>> Solr's PatternReplaceFilter(ts, "\\P{Alnum}+$", "", false) should work,
>> chained after an appropriate tokenizer.
>>
>> Steve
>>
>> On 02/04/2010 at 12:18 PM, Jason Rutherglen wrote:
>>> Is there an analyzer that easily strips non alpha-numeric from the end
>>> of a token?
>>>
>>> --------------------------------------------------------------------- To
>>> unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For
>>> additional commands, e-mail: java-user-help@lucene.apache.org
>
