lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: QueryParser returning TermQuery instead of PhraseQuery?
Date Tue, 21 Oct 2008 12:12:12 GMT
I'd really recommend you get a copy of Luke and examine what
your index really contains. You say:

<<<This is one case where it is not although the same results are
produced with the Standard Analyzer.>>>

StandardAnalyzer? When? If you *index* with SnowballAnalyzer, your
token would (probably) be "foo" rather than "foo1". So at *query* time,
it wouldn't make any difference whether you used StandardAnalyzer
or SnowballAnalyzer, since you didn't index the '1' in foo1. This is
mostly a guess though....
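
To make that guess concrete, here is a toy sketch in plain Java (no Lucene
involved). Everything in it is an assumption for illustration: lossyAnalyze
is a made-up stand-in that strips trailing digits, which is NOT necessarily
what SnowballAnalyzer does to "foo1", and the "index" is just a map from
term to document ids. It only shows that once index-time analysis has
collapsed two values into one term, the query-time analyzer cannot tell
them apart again.

```java
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

public class IndexTimeAnalysisDemo {

    // Hypothetical lossy analyzer: lowercases and drops trailing digits.
    // A stand-in only; it does not reproduce any real Lucene analyzer.
    static String lossyAnalyze(String token) {
        return token.toLowerCase(Locale.ROOT).replaceAll("\\d+$", "");
    }

    // Hypothetical non-lossy analyzer: lowercases and nothing else.
    static String plainAnalyze(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Builds a toy inverted index (term -> ids of the documents containing it).
    static Map<String, SortedSet<Integer>> buildIndex(UnaryOperator<String> analyzer,
                                                      String... docs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (int id = 0; id < docs.length; id++) {
            index.computeIfAbsent(analyzer.apply(docs[id]), t -> new TreeSet<>()).add(id);
        }
        return index;
    }

    // Looks up a term that has already been analyzed for query time.
    static SortedSet<Integer> search(Map<String, SortedSet<Integer>> index, String term) {
        return index.getOrDefault(term, new TreeSet<Integer>());
    }

    public static void main(String[] args) {
        // Document 0 holds "foo", document 1 holds "foo1"; both are indexed lossily.
        Map<String, SortedSet<Integer>> index = buildIndex(
                IndexTimeAnalysisDemo::lossyAnalyze, "foo", "foo1");

        // The index holds a single term, "foo", pointing at both documents.
        System.out.println(index);

        // Querying for foo matches both documents whichever analyzer is used.
        System.out.println(search(index, plainAnalyze("foo")));
        System.out.println(search(index, lossyAnalyze("foo")));

        // The term "foo1" was never indexed, so a non-lossy query finds nothing.
        System.out.println(search(index, plainAnalyze("foo1")));
    }
}
```

Luke would show you the equivalent of the first println for your real
index, which is why it is worth looking there before changing the query.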

StandardAnalyzer is, IMO, a bit of a misnomer. It tries to do more
than one thinks (e.g. handle e-mail addresses). This sometimes
produces surprising results.

You'll find Luke (google lucene luke) is invaluable both for figuring
out what is in your index and what queries look like when processed
by various analyzers. And what documents get found. And why <G>.

And I second Daniel's point. Why do you "sometimes" want exact
matches and sometimes want stemmed matches? You can't really
get both of those things out of one field in your documents. If you
stem at indexing time, the extra data is lost and you can't regain it.
If you don't stem at index time, you can't very well get stemmed
matches at query time.....
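
If you really do need both behaviours, one common way out is to index the
same value twice, into two differently analyzed fields, and pick the field
at query time. The sketch below is plain Java rather than Lucene code, and
every name in it is an assumption for illustration: the "exact" and
"stemmed" field names are hypothetical, and toyStem (strip one trailing
"s") is a stand-in, not the Snowball algorithm. In Lucene itself the same
idea is usually expressed with two fields and a per-field analyzer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class TwoFieldIndexDemo {

    // field name -> term -> ids of the documents containing that term
    private final Map<String, Map<String, SortedSet<Integer>>> fields = new HashMap<>();

    // Hypothetical stemmer: lowercases and strips a single trailing "s".
    static String toyStem(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return (lower.length() > 1 && lower.endsWith("s"))
                ? lower.substring(0, lower.length() - 1)
                : lower;
    }

    // Indexes one value under both fields, so neither form is lost.
    void add(int docId, String value) {
        put("exact", value.toLowerCase(Locale.ROOT), docId);
        put("stemmed", toyStem(value), docId);
    }

    private void put(String field, String term, int docId) {
        fields.computeIfAbsent(field, f -> new HashMap<>())
              .computeIfAbsent(term, t -> new TreeSet<>())
              .add(docId);
    }

    // Analyzes the query the same way the chosen field was analyzed at index time.
    SortedSet<Integer> search(String field, String rawQuery) {
        String term = field.equals("stemmed")
                ? toyStem(rawQuery)
                : rawQuery.toLowerCase(Locale.ROOT);
        Map<String, SortedSet<Integer>> terms = fields.get(field);
        if (terms == null || !terms.containsKey(term)) {
            return new TreeSet<Integer>();
        }
        return terms.get(term);
    }

    public static void main(String[] args) {
        TwoFieldIndexDemo index = new TwoFieldIndexDemo();
        index.add(0, "cats");
        index.add(1, "cat");

        // Exact field: only the document whose value is literally "cat".
        System.out.println(index.search("exact", "cat"));    // [1]

        // Stemmed field: both documents reduce to the same term.
        System.out.println(index.search("stemmed", "cats")); // [0, 1]
    }
}
```

The cost is a larger index, since every value is stored under two sets of
terms, but it avoids special-casing the query parser.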

Best
Erick

On Tue, Oct 21, 2008 at 6:45 AM, samd <sdoyle_2@yahoo.com> wrote:

>
> The Snowball Analyzer was chosen since there are cases where the stemming is
> desired. This is one case where it is not although the same results are
> produced with the Standard Analyzer. If this doesn't work I guess I'll
> probably need to try to programmatically provide an additional field to the
> search parameters in order to make the results unique.
>
>
> Daniel Noll-3 wrote:
> >
> > samd wrote:
> >> I have field for example say "foo" I need to match exactly foo but there is
> >> also another field for example called "foo1"
> >>
> >> What I want is a PhraseQuery so I surround foo with quotes before it gets
> >> passed to the QueryParser.parse method. However I get back a TermQuery and
> >> the values that match foo1 end up being returned in the results but I need
> >> an exact match on foo.
> >
> > I'm not sure what you're trying to achieve, but:
> >
> >     1. Merely putting quotes around something doesn't make it a
> >        PhraseQuery, having more than one term inside the quotes makes
> >        something a phrase query.
> >
> >     2. Stop word removal sometimes drops a word, such that what you
> >        thought was a two-term query is actually one.
> >
> >     3. Whether it's a PhraseQuery or TermQuery has no effect on the way
> >        it matches each individual term, i.e. it won't be any more or less
> >        "exact".
> >
> >> I don't want to have to have a special case for PhraseQuery where I need to
> >> bypass the parse method and manually construct this. Besides I'm not even
> >> sure if that will work.
> >
> > Even if it does "work" it won't change the semantics.
> >
> > This is sounding like an X-Y problem, so what are you actually trying to
> > achieve?  It sounds like you don't want stemming (talking about "exact"
> > matches) yet you chose the snowball analyser (whose sole purpose is
> > stemming, unless I am mistaken...)
> >
> > Daniel
> >
> >
> > --
> > Daniel Noll                            Forensic and eDiscovery Software
> > Senior Developer                              The world's most advanced
> > Nuix                                                email data analysis
> > http://nuix.com/                                and eDiscovery software
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/QueryParser-returning-TermQuery-instead-of-PhraseQuery--tp20082388p20087679.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
