lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: QueryParser returning TermQuery instead of PhraseQuery?
Date Tue, 21 Oct 2008 19:03:47 GMT
This can absolutely be done, so don't go off the deep end <G>.

Could we see the index snippet and the search snippet when
you tried with StandardAnalyzer?

Erick

On Tue, Oct 21, 2008 at 2:59 PM, samd <sdoyle_2@yahoo.com> wrote:

>
> I have tried doing both the indexing and the query parsing with the same
> analyzer, StandardAnalyzer, and I'm seeing the same result.
>
> By the looks of it, I'm going to have to search on the document ID if I
> can't do an exact search on a single term.
>
>
> Erick Erickson wrote:
> >
> > I'd really recommend you get a copy of Luke and examine what
> > your index really contains. You say:
> >
> > <<<This is one case where it is not although the same results are
> > produced with the Standard Analyzer.>>>
> >
> > StandardAnalyzer? When? If you *index* with SnowballAnalyzer, your
> > token would (probably) be "foo" rather than "foo1". So at *query* time,
> > it wouldn't make any difference whether you used StandardAnalyzer
> > or SnowballAnalyzer, since you didn't index the '1' in foo1. This is
> > mostly a guess though....
> >
> > StandardAnalyzer is, IMO, a bit of a misnomer. It tries to do more
> > than one thinks (e.g. handle e-mail addresses). This sometimes
> > produces surprising results.
> >
> > You'll find Luke (google lucene luke) is invaluable both for figuring
> > out what is in your index and what queries look like when processed
> > by various analyzers. And what documents get found. And why <G>.
> >
> > And I second Daniel's point. Why do you "sometimes" want exact
> > matches and sometimes want stemmed matches? You can't really
> > get both of those things out of one field in your documents. If you
> > stem at indexing time, the extra data is lost and you can't regain it.
> > If you don't stem at index time, you can't very well get stemmed
> > matches at query time.....
> >
> > Best
> > Erick
> >
> > On Tue, Oct 21, 2008 at 6:45 AM, samd <sdoyle_2@yahoo.com> wrote:
> >
> >>
> >> The Snowball Analyzer was chosen since there are cases where the
> >> stemming is desired. This is one case where it is not although the same
> >> results are produced with the Standard Analyzer. If this doesn't work I
> >> guess I'll probably need to try to programmatically provide an
> >> additional field to the search parameters in order to make the results
> >> unique.
> >>
> >>
> >> Daniel Noll-3 wrote:
> >> >
> >> > samd wrote:
> >> >> I have a field, for example say "foo", that I need to match exactly,
> >> >> but there is also another field, for example called "foo1".
> >> >>
> >> >> What I want is a PhraseQuery, so I surround foo with quotes before it
> >> >> gets passed to the QueryParser.parse method. However, I get back a
> >> >> TermQuery, and the values that match foo1 end up being returned in
> >> >> the results, but I need an exact match on foo.
> >> >
> >> > I'm not sure what you're trying to achieve, but:
> >> >
> >> >     1. Merely putting quotes around something doesn't make it a
> >> >        PhraseQuery; having more than one term inside the quotes makes
> >> >        something a phrase query.
> >> >
> >> >     2. Stop-word removal sometimes drops a word, such that what you
> >> >        thought was a two-term query is actually a one-term query.
> >> >
> >> >     3. Whether it's a PhraseQuery or TermQuery has no effect on the way
> >> >        it matches each individual term, i.e. it won't be any more or
> >> >        less "exact".
> >> >
> >> >> I don't want to have to have a special case for PhraseQuery where I
> >> >> need to bypass the parse method and manually construct this. Besides
> >> >> I'm not even sure if that will work.
> >> >
> >> > Even if it does "work" it won't change the semantics.
> >> >
> >> > This is sounding like an X-Y problem, so what are you actually trying
> >> > to achieve?  It sounds like you don't want stemming (talking about
> >> > "exact" matches) yet you chose the snowball analyser (whose sole
> >> > purpose is stemming, unless I am mistaken...)
> >> >
> >> > Daniel
> >> >
> >> >
> >> > --
> >> > Daniel Noll                            Forensic and eDiscovery Software
> >> > Senior Developer                              The world's most advanced
> >> > Nuix                                                email data analysis
> >> > http://nuix.com/                                and eDiscovery software
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/QueryParser-returning-TermQuery-instead-of-PhraseQuery--tp20082388p20087679.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/QueryParser-returning-TermQuery-instead-of-PhraseQuery--tp20082388p20097121.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>
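
To make the point above about stemming concrete (one field cannot serve both
exact and stemmed matches, because stemming at index time throws the original
form away), here is a minimal sketch of the two-field idea. It deliberately
uses no Lucene classes so that it runs on a plain JDK. Everything in it is an
assumption made for illustration: the class name TwoFieldIndexSketch, the
field names "exact" and "stemmed", and above all toyStem, which only strips a
trailing "s" and is in no way the Snowball algorithm. In Lucene itself the
same effect would normally be had by indexing the text into two fields with
different analyzers (PerFieldAnalyzerWrapper exists for that purpose), but
that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index that stores every token twice: as-is and stemmed. */
public class TwoFieldIndexSketch {

    // field name -> (term -> sorted ids of the documents containing that term)
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

    /** Hypothetical stand-in for a real stemmer: lowercases, strips one trailing "s". */
    public static String toyStem(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        return (t.length() > 1 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }

    /** Whitespace tokenizer that lowercases each token. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Indexes each token into both the "exact" and the "stemmed" field. */
    public void addDocument(int docId, String text) {
        for (String token : tokenize(text)) {
            post("exact", token, docId);
            post("stemmed", toyStem(token), docId);
        }
    }

    private void post(String field, String term, int docId) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new TreeSet<>())
                .add(docId);
    }

    /** Matches only documents that contain the word in its original form. */
    public Set<Integer> searchExact(String word) {
        return lookup("exact", word.toLowerCase(Locale.ROOT));
    }

    /** Matches documents that contain any word with the same toy stem. */
    public Set<Integer> searchStemmed(String word) {
        return lookup("stemmed", toyStem(word));
    }

    private Set<Integer> lookup(String field, String term) {
        Map<String, Set<Integer>> terms =
                postings.getOrDefault(field, Collections.<String, Set<Integer>>emptyMap());
        return terms.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        TwoFieldIndexSketch index = new TwoFieldIndexSketch();
        index.addDocument(1, "report");
        index.addDocument(2, "reports");

        System.out.println(index.searchExact("report"));   // prints [1]
        System.out.println(index.searchStemmed("report")); // prints [1, 2]
    }
}
```

The cost of this design is a larger index, since every token is stored twice.
What it buys is that the choice between an exact and a stemmed match becomes a
choice of which field to query, instead of something the query syntax (quotes
or no quotes) is expected to provide.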
