From: Rajiv2
To: java-user@lucene.apache.org
Subject: Re: IDF scoring issue
Date: Tue, 16 Dec 2008 19:43:43 -0800 (PST)
Message-ID: <21046615.post@talk.nabble.com>
In-Reply-To: <359a92830812161848wf8a5e49va366aed2f50b3c9a@mail.gmail.com>
References: <21045385.post@talk.nabble.com>
 <359a92830812161848wf8a5e49va366aed2f50b3c9a@mail.gmail.com>

To answer your questions:

1. There are only two words in each document I'm searching: city and state
   abbreviation, lowercased and analyzed by WhitespaceAnalyzer.
2. The only field, and the default field, is text, so the query becomes
   text:fleming text:roofing text:inc. ...etc. Using the query operator AND
   instead of OR gives me no results, which does not help.
3. I've been using explain in Luke, and the only difference between
   "fleming ga" and "marietta ga" is that the idf value is higher for
   "fleming". That's why "fleming ga" has a higher score.

Basically, I'm just trying to get the "marietta ga" doc to score higher. In
the query text, "marietta" and "ga" are closer together than "fleming" and
"ga".

rajiv

Erick Erickson wrote:
>
> Note a couple of things:
>
> 1> how a doc scores also takes into account how many other words
> are in the field you're querying on.
> 2> Is "text" your default field? Because what you posted is really
> searching text:fleming <default field>:roofing <default field>:inc......
> Note also the implicit OR between each of them. Is this really your
> intent?
> 3> query.explain (as i remember) is your friend to figure out how the
> weights are being calculated. If you haven't got a copy of Luke, I'd
> *strongly* advise getting one and looking at the "explain" tab...
>
> Best
> Erick
>
> On Tue, Dec 16, 2008 at 8:19 PM, Rajiv2 wrote:
>
>> Hello,
>>
>> I'm using the default lucene Queryparser on the search text : fleming
>> roofing inc., marietta ga
>>
>> These items are in my index.
>>
>> doc 1: fleming ga
>> doc 2: marietta ga
>> doc 3: marietta il
>> doc 4: marietta ok
>> doc 5: marietta ok
>> doc 6: fleming pa
>>
>> The first match is always "fleming ga" even though "marietta ga" is
>> closer together in the search text. I'm assuming this is because
>> "fleming" has a higher idf than "marietta".
>> What should I change in the way I'm querying
>> or indexing to make this happen?
>>
>> Also, I don't want to modify the search text by putting quotes around
>> "marietta ga" which forces the query parser to make a phrase query.
>>
>> thanks,
>> Rajiv
>> --
>> View this message in context:
>> http://www.nabble.com/IDF-scoring-issue-tp21045385p21045385.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org

--
View this message in context: http://www.nabble.com/IDF-scoring-issue-tp21045385p21046615.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
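
For reference, the idf gap described in this thread can be reproduced by hand. The sketch below is only an illustration: it assumes the classic DefaultSimilarity formula, idf = 1 + ln(numDocs / (docFreq + 1)), and uses the six documents listed in the original post. The names IdfSketch, DOCS, docFreq and idf are made up for this example and are not Lucene API; a real index may report slightly different numbers (for example if deleted documents are still counted).

```java
import java.util.Arrays;
import java.util.List;

public class IdfSketch {
    // The six documents from the original post, one "city state" string each.
    static final List<String> DOCS = Arrays.asList(
        "fleming ga", "marietta ga", "marietta il",
        "marietta ok", "marietta ok", "fleming pa");

    // Number of documents containing the term, using whitespace tokenization
    // (the thread says the field is analyzed by WhitespaceAnalyzer).
    static int docFreq(String term) {
        int count = 0;
        for (String doc : DOCS) {
            if (Arrays.asList(doc.split("\\s+")).contains(term)) {
                count++;
            }
        }
        return count;
    }

    // Assumed formula: idf = 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Print document frequency and idf for the terms that match documents.
        for (String term : new String[] {"fleming", "marietta", "ga"}) {
            int df = docFreq(term);
            System.out.printf("%-8s docFreq=%d idf=%.4f%n",
                term, df, idf(df, DOCS.size()));
        }
    }
}
```

Under this assumed formula, "fleming" occurs in 2 of the 6 documents and "marietta" in 4, so "fleming" gets the larger idf (roughly 1.69 versus 1.18), while "ga" contributes the same amount to both "fleming ga" and "marietta ga". That is consistent with the explain output described above, where idf is the only difference between the two documents.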