lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Doing the lucene remove character \n (break line)
Date Mon, 24 Nov 2008 01:49:33 GMT
What I'd do is make my own filter, probably one based upon one of
the pre-existing ones and modify the call to nextToken, examine that
token, and if it ends in a hyphen get the next token and return the
concatenation of the two. I don't believe that there's a pre-existing
filter that does this, but you might want to check because I haven't
looked at them an a while.

Best
Erick

On Sun, Nov 23, 2008 at 4:40 PM, farnetani <farnetani@gmail.com> wrote:

>
> I need to do lucene find the sentence:
> ARLEI FERREIRA FARNETANI JUNIOR
> [arlei] [ferreira] [farnetani] [junior]    (1)
>
> and too:
>
> ARLEI FERREIRA FAR-   <break line>
> NETANI JUNIOR
>
> I'm using the Brazilian Analyzer, but the result is:
> [ARLEI] [FERREIRA] [FAR] [NETANI] [JUNIOR]
>
> I have to do that the lucene result:
> [ARLEI] [FERREIRA] [FARNETANI] [JUNIOR] equals the sentence (1)
>
> So I have to do that lucene remove "-" hyphen and the break line (\n).
>
> To remove character hyphen (-) I got, but remove the break line no!
>
> How do I do???
> --
> View this message in context:
> http://www.nabble.com/Doing-the-lucene-remove-character-%5Cn-%28break-line%29-tp20650540p20650540.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message