From: Erik Hatcher
Subject: Re: Search matching
Date: Tue, 1 Aug 2006 15:18:11 -0400
To: java-user@lucene.apache.org

Rajiv,

Have a look at the details provided by IndexSearcher.explain() for
those documents, and you'll get some insight into the factors used to
rank them. Since both scores are 1.0, you'll probably want to
implement your own custom Similarity and override lengthNorm() to
adjust that factor.

Another technique you can use is to expand a user's query into a more
sophisticated boolean query, such that a user's query for
"new york ny" would become (in Query.toString format):

    +new +york +ny "new york ny"

which would boost exact matches.

    Erik

On Aug 1, 2006, at 1:19 PM, Rajiv Roopan wrote:

> Ok, this is how I'm indexing. Both in indexing and searching I'm using
> SimpleAnalyzer()
>
> String loc = "New York, NY";
> doc.add(new Field("location", loc, Field.Store.NO,
>     Field.Index.TOKENIZED));
>
> String loc2 = "New York Mills, NY";
> doc.add(new Field("location", loc2, Field.Store.NO,
>     Field.Index.TOKENIZED));
>
> and this is how I'm searching...
>
> String searchStr = "New York, NY";
> Analyzer analyzer = new SimpleAnalyzer();
> QueryParser parser = new QueryParser("location", analyzer);
> parser.setDefaultOperator(QueryParser.AND_OPERATOR);
> Query query = parser.parse( searchStr );
>
> Hits hits = searcher.search( query );
>
> I've tried all query types and every time "new york mills, ny" is in
> hits(0). Both results have a score of 1.0. I know I can add some kind
> of sort to always make the shorter field first. But shouldn't the
> first by default, due to the scoring algorithm, be "new york, ny"
> because it's a shorter field?
>
> let me know if i'm missing something. thanks!
>
> rajiv
>
> On 8/1/06, Simon Willnauer wrote:
>>
>> I guess so, but without any information about your code nobody can
>> tell what.
>> If you provide more information you will get help!!
>>
>> regards simon
>>
>> On 8/1/06, Rajiv Roopan wrote:
>> > Hello, I have an index of locations for example. I'm indexing one
>> > field using SimpleAnalyzer.
>> >
>> > doc1: albany ny
>> > doc2: hudson ny
>> > doc3: new york ny
>> > doc4: new york mills ny
>> >
>> > when I search for "new york ny", the first result returned is
>> > always "new york mills ny". Am I doing something incorrect?
>> >
>> > thanks in advance,
>> > rajiv
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
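
A rough sketch of the query expansion Erik describes in his reply, in case it helps. It only builds the expanded query string; it makes no Lucene calls, so it runs on its own. The class name QueryExpansionSketch and the method expand() are made up for illustration and are not part of Lucene. The tokenizing step is an approximation of what SimpleAnalyzer does (split on non-letters, lowercase), not the analyzer itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not a Lucene class.
class QueryExpansionSketch {

    // Turns raw user input into "+t1 +t2 ... \"t1 t2 ...\"":
    // every term is required, and the optional phrase clause
    // gives exact-order matches an extra scoring contribution.
    static String expand(String userQuery) {
        // Approximate SimpleAnalyzer: split on non-letters, lowercase.
        List<String> terms = new ArrayList<>();
        for (String token : userQuery.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        if (terms.isEmpty()) {
            return "";
        }

        // One required (+) clause per term.
        StringBuilder required = new StringBuilder();
        for (String term : terms) {
            required.append('+').append(term).append(' ');
        }

        // A phrase clause only makes sense for two or more terms.
        if (terms.size() < 2) {
            return required.toString().trim();
        }
        return required + "\"" + String.join(" ", terms) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(expand("New York, NY"));
    }
}
```

The resulting string could then be handed to the same QueryParser used in the quoted search code (default field "location"). Whether the phrase clause is enough to push "new york ny" above "new york mills ny" is best checked with IndexSearcher.explain() on both documents.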