lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anshum <ansh...@gmail.com>
Subject Re: IDF scoring issue
Date Wed, 17 Dec 2008 05:04:19 GMT
Hi Rajiv,
If 'm interpreting your problem correctly, I'd suggest you to try using a
phraseQuery with an appropriate slop value. Though again it depends on what
is it that you exactly are trying to fetch.

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Wed, Dec 17, 2008 at 9:13 AM, Rajiv2 <rajiv.roopan@gmail.com> wrote:

>
> To answer your questions,
> 1. there are only two words in the document I'm searching -- city and state
> abbrev. lowercased and analyzed by whitespaceanalyzer
> 2. the only field and default field is text, so the query becomes text:
> fleming text:roofing txt:inc. ...etc.
>   Using query operator AND instead of OR gives me no results which does not
> help.
> 3. I've been using explain in Luke and the only difference between "fleming
> ga" and "marietta ga" is the idf value is higher for "flemming" ... that's
> why "fleming ga" has a higher score.
>
> Basically i'm just trying to get the "marietta ga" doc to score higher. In
> the query text the two words are closer together than "fleming" and "ga".
>
> rajiv
>
>
>
> Erick Erickson wrote:
> >
> > Note a couple of things:
> >
> > 1> how a doc scores also takes into account how many other words
> >      are in the field you're querying on.
> > 2> Is "text" your default field? Because what you posted is really
> >      searching text:fleming <default field>:roofing <default
> > field>:inc......
> >      Not also the implicit OR between each of them. Is this really your
> >      intent?
> > 3> query.explain (as i remember) is your friend to figure out how the
> >      weights are being calculated. If you haven't got a copy of Luke, I'd
> >     *strongly* advise getting one and looking at the "explain" tab...
> >
> > Best
> > Erick
> >
> > On Tue, Dec 16, 2008 at 8:19 PM, Rajiv2 <rajiv.roopan@gmail.com> wrote:
> >
> >>
> >> Hello,
> >>
> >> I'm using the default lucene Queryparser on the search text : fleming
> >> roofing inc., marietta ga
> >>
> >> These items are in my index.
> >>
> >> doc 1: fleming ga
> >> doc 2: marietta ga
> >> doc 3: marietta il
> >> doc 4: marietta ok
> >> doc 5: marietta ok
> >> doc 6: fleming pa
> >>
> >> The first match is always "fleming ga" even though "marietta ga" is
> >> closer
> >> together in the search text. I'm assuming this is because of the
> >> "fleming"
> >> has a higher idf than marietta. What should I change in the way i'm
> >> querying
> >> or indexing to make this happen?
> >>
> >> Also, I don't want to modify the search text by putting quotes around
> >> "marietta ga" which forces the query parser to make a phrase query.
> >>
> >> thanks,
> >> Rajiv
> >> --
> >> View this message in context:
> >> http://www.nabble.com/IDF-scoring-issue-tp21045385p21045385.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/IDF-scoring-issue-tp21045385p21046615.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message