lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: using CharFilter to inject a space
Date Sat, 03 Nov 2012 23:42:52 GMT
On Sat, Nov 3, 2012 at 7:35 PM, Igal @ getRailo.org <igal@getrailo.org> wrote:
> hi,
>
> I want to make sure that every comma (,) and semi-colon (;) is followed by a
> space prior to tokenizing.
>
> the idea is to then use a WhitespaceTokenizer which will keep commas but
> still split the phrase in a case like:
>
>     "I bought red apples,green pears,and yellow oranges"
>
> I'm thinking of extending CharFilter to "inject" a space after the comma.
> my questions are:
>
>     1) does it make sense or am I completely off here?
>
>     2) are there any code examples of CharFilter implementations with
> injection of a char?

Can't you just use something like MappingCharFilter with a single
mapping of "," to ", " ?
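
To illustrate the effect of such a mapping, here is a minimal plain-Java sketch. It does not use the Lucene API: it only simulates what the character mapping followed by whitespace splitting would produce. The class and method names (CommaSpaceSketch, injectSpaces, whitespaceTokens) are hypothetical, and the ";" mapping is added as an assumption to cover the semi-colon case from the question. In Lucene itself the same mappings would be registered in a NormalizeCharMap and applied by wrapping the Reader in a MappingCharFilter ahead of the WhitespaceTokenizer; the exact construction calls depend on the Lucene version.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the MappingCharFilter + WhitespaceTokenizer chain.
// It only mimics the text transformation; it does no offset correction.
public class CommaSpaceSketch {

    // Apply the two mappings: "," -> ", " and ";" -> "; ".
    static String injectSpaces(String text) {
        return text.replace(",", ", ").replace(";", "; ");
    }

    // Split on runs of whitespace, roughly what a whitespace tokenizer emits.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        String input = "I bought red apples,green pears,and yellow oranges";
        // Each token is printed on its own line; the commas stay attached
        // to the preceding word, as the question asks.
        for (String token : whitespaceTokens(injectSpaces(input))) {
            System.out.println(token);
        }
    }
}
```

Text that already has a space after the comma ends up with two spaces, which is harmless here because the split treats any run of whitespace as a single separator.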
