lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: IDF scoring issue
Date Wed, 17 Dec 2008 02:48:36 GMT
Note a couple of things:

1> how a doc scores also takes into account how many other words
     are in the field you're querying on.
2> Is "text" your default field? Because what you posted is really
     searching text:fleming <default field>:roofing <default
field>:inc......
     Not also the implicit OR between each of them. Is this really your
     intent?
3> query.explain (as i remember) is your friend to figure out how the
     weights are being calculated. If you haven't got a copy of Luke, I'd
    *strongly* advise getting one and looking at the "explain" tab...

Best
Erick

On Tue, Dec 16, 2008 at 8:19 PM, Rajiv2 <rajiv.roopan@gmail.com> wrote:

>
> Hello,
>
> I'm using the default lucene Queryparser on the search text : fleming
> roofing inc., marietta ga
>
> These items are in my index.
>
> doc 1: fleming ga
> doc 2: marietta ga
> doc 3: marietta il
> doc 4: marietta ok
> doc 5: marietta ok
> doc 6: fleming pa
>
> The first match is always "fleming ga" even though "marietta ga" is closer
> together in the search text. I'm assuming this is because of the "fleming"
> has a higher idf than marietta. What should I change in the way i'm
> querying
> or indexing to make this happen?
>
> Also, I don't want to modify the search text by putting quotes around
> "marietta ga" which forces the query parser to make a phrase query.
>
> thanks,
> Rajiv
> --
> View this message in context:
> http://www.nabble.com/IDF-scoring-issue-tp21045385p21045385.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message