lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Issues with Special Characters
Date Tue, 16 Sep 2008 13:45:29 GMT
You can easily answer the questions about what WhitespaceTokenizer
produces by getting a copy of Luke and looking at your index, or by writing
a really simple test program that prints out the tokens.
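
For illustration, here is a minimal sketch of such a test program. It does
not call Lucene at all: the class WhitespaceTokenDemo and its tokenize()
method are hypothetical stand-ins that only mimic the "split on whitespace,
keep everything else" behaviour, so treat the output as an approximation
and confirm against your real index with Luke.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for whitespace-only tokenization; not the Lucene API.
public class WhitespaceTokenDemo {

    // Splits on whitespace only; every other character stays in the token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print each token in brackets so punctuation is easy to see.
        for (String token : tokenize("jason dartling (e-mail)")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Under these assumptions the parentheses and the hyphen stay attached to
their token, so "(e-mail)" comes out as a single token rather than being
chopped off.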

At the bottom of this page is a list of special characters for escaping:
http://lucene.apache.org/java/docs/queryparsersyntax.html
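
As a rough sketch of one generic approach, you can backslash-escape every
special character in the user's input before handing it to the query
parser. Lucene's QueryParser has a static escape(String) method for this;
the class below (QueryEscapeDemo, a hypothetical name) is only a
stand-alone imitation so it runs without Lucene on the classpath. Its
character set is an assumption taken from the list in the question, and
escaping each & and | individually is a simplification.

```java
// Hypothetical stand-in for query escaping; not the Lucene implementation.
public class QueryEscapeDemo {

    // Characters treated as special (assumed from the list in the question).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Puts a backslash in front of every special character.
    public static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("re:"));
        System.out.println(escape("jason dartling (e-mail)"));
    }
}
```

Escaping all input the same way avoids having to decide case by case
whether a given string would trip the parser.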

Best
Erick

On Tue, Sep 16, 2008 at 9:05 AM, miztaken <justjunktome@gmail.com> wrote:

>
> Hi there,
> I am using WhitespaceAnalyzer to index documents. I chose it because I
> need to split tokens on whitespace only. I also set Tokenized=true.
> While indexing, what does it do with special characters like + - && || ! ( )
> { } [ ] ^ " ~ * ? : \ ? Will these characters be indexed, or will they be
> chopped off? I am confused about this.
>
> Now I am having a problem while searching as well.
> For query strings like "jason dartling (e-mail)" and "re: fyi.dat", I don't
> have to escape the special characters ( , ) and :, but for input such as
> "re:" QueryParser produces an error, so I have escaped the characters there.
> So it seems I have two cases to deal with.
> Can anyone suggest one generic way to deal with both cases?
>
> Basically, my general question is: how do I index and search strings that
> contain characters needing escaping?
>
>
> Please help me
> miztaken
