lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Escaping special characters
Date Sun, 23 Sep 2007 17:14:03 GMT
Did you use the same analyzer for both querying and indexing?
If so, which one?

I recommend you use Luke to examine the parsing for various
queries. Start by making sure WhitespaceAnalyzer is the analyzer
chosen on the "search" tab and go from there.
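
To make the analyzer point concrete, here is a rough sketch in plain Java with no Lucene dependency. Both methods are hypothetical stand-ins I made up for illustration: `whitespaceTokens` only splits on whitespace, and `alnumTokens` lowercases and keeps only letters and digits. Neither is Lucene's actual WhitespaceAnalyzer or StandardAnalyzer, and neither handles stop words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzerSketch {
    // Hypothetical stand-in for a whitespace-only analyzer:
    // split on runs of whitespace and keep every other character as-is.
    public static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical stand-in for an analyzer that lowercases and treats
    // anything other than a-z and 0-9 as a separator.
    public static List<String> alnumTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "C++ and C#";
        System.out.println(whitespaceTokens(text)); // [C++, and, C#]
        System.out.println(alnumTokens(text));      // [c, and, c]
    }
}
```

If the index was built with something like the second method, the terms "C++" and "C#" never made it into the index, and no amount of query escaping will find them.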

I suspect we'll keep talking past each other until you include some
code snippets. The important parts are:
1> what analyzer you use at INDEX time.
2> what analyzer you use at SEARCH time.
3> what is the query you submit?
4> what is returned from Query.toString().

A small, self-contained program with the *minimal* code
showing your problem would allow folks to give you much
better feedback...
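
On the query side (item 3 above), here is a rough sketch of what backslash-escaping does, again in plain Java with no Lucene dependency. The `escape` and `unescape` helpers and the `SPECIAL` constant are hypothetical names of mine; the character list is taken from the query parser syntax page as I remember it, so treat it as an assumption. Lucene has its own `QueryParser.escape`, I believe, which is what you would normally call instead.

```java
public class QueryEscapeSketch {
    // Assumed set of query-syntax special characters (see the lead-in).
    // '&' and '|' are escaped individually; note that '#' is not in the set.
    private static final String SPECIAL = "\\+-!(){}[]^\"~*?:|&";

    // Hypothetical helper: put a backslash before every special character.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Hypothetical helper: what is left once the backslashes are consumed,
    // i.e. the text that would still have to go through the analyzer.
    public static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                c = s.charAt(++i); // keep the escaped character literally
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("c++");
        System.out.println(escaped);           // c\+\+
        System.out.println(escape("c#"));      // c#
        System.out.println(unescape(escaped)); // c++
    }
}
```

Escaping only keeps the parser from treating "+" as an operator. The unescaped text is still handed to the search-time analyzer, which is why the analyzer and the Query.toString() output matter so much here.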

Best
Erick

On 9/23/07, Tom Conlon <tomc@2ls.com> wrote:
>
> Thanks for the input Erick,
>
> I changed the analyser to WhitespaceAnalyzer but got the same results.
>
> You're right about what was indexed
>
> Enter Querystring:
> (c++ and c#)
> Searching for: c c
>
> Enter Querystring:
> (c\+\+ and c\#)
> Searching for: c c
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: 23 September 2007 14:47
> To: java-user@lucene.apache.org
> Subject: Re: Escaping special characters
>
> Default as in StandardAnalyzer? StandardAnalyzer typically lowercases and
> drops punctuation such as "+" and "#" when it tokenizes.
> I'd recommend that you use something really simple like
> WhitespaceAnalyzer for something like this, or possibly create your own
> analyzer that uses LowerCaseFilter and breaks on whitespace.
>
> I suspect that if you open your index with Luke, you'll find that what
> you actually indexed was "C" rather than "C++" etc. But that's only a
> guess.
>
> What does query.toString() return?
>
> Best
> Erick
>
> On 9/23/07, Tom Conlon <tomc@2ls.com> wrote:
> >
> > Hi Karl,
> >
> > > Did you use the same analyzer when populating the index as when you
> > > create the query?
> >
> > Yes, I used the default analyzer in a modified version of the command
> > line demo.
> >
> > > If you did, can you demonstrate the problem with a small stand alone
> > > test case?
> >
> > I'll try.
> >
> > Tom
> >
> > -----Original Message-----
> > From: Karl Wettin [mailto:karl.wettin@gmail.com]
> > Sent: 23 September 2007 10:39
> > To: java-user@lucene.apache.org
> > Subject: Re: Escaping special characters
> >
> >
> > On 23 Sep 2007 at 10:53, Tom Conlon wrote:
> >
> > >
> > > Unless I'm missing something, according to:
> > >
> > > http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters
> > >
> > > I should be able to search for C++ and C# using something like:
> > > C\+\+ and C\#.
> >
> > That is correct.
> >
> > >
> > > This doesn't work.
> >
> > Did you use the same analyzer when populating the index as when you
> > create the query?
> >
> > If you did, can you demonstrate the problem with a small stand alone
> > test case?
> >
> >
> > --
> > karl
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
