From: "Gregory Tarr"
Subject: RE: SpanNearQuery - inOrder parameter
Date: Tue, 10 May 2011 13:58:40 +0100
Message-ID: <614C529D389A5944B351F7DFB7594F240121F971@uksrpblkexb01.detica.com>
In-Reply-To: <614C529D389A5944B351F7DFB7594F240121F73B@uksrpblkexb01.detica.com>

Is anyone able to help me with the problem below?

Thanks
Greg

-----Original Message-----
From: Gregory Tarr [mailto:Gregory.tarr@detica.com]
Sent: 09 May 2011 12:33
To: java-user@lucene.apache.org
Subject: RE: SpanNearQuery - inOrder parameter

The attachment didn't work, so the test is pasted below:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TopDocsCollector;
    import org.apache.lucene.search.TopScoreDocCollector;
    import org.apache.lucene.search.spans.SpanNearQuery;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;
    import org.junit.Assert;
    import org.junit.Test;

    public class TestSpanNearQueryInOrder {

        @Test
        public void testSpanNearQueryInOrder() throws Exception {
            // Build a small in-memory index with three documents.
            RAMDirectory directory = new RAMDirectory();
            IndexWriter writer = new IndexWriter(directory,
                    new StandardAnalyzer(Version.LUCENE_29), true,
                    IndexWriter.MaxFieldLength.UNLIMITED);
            TopDocsCollector collector = TopScoreDocCollector.create(3, false);

            // DOC1: "aaaa" occurs once
            Document doc = new Document();
            doc.add(new Field("text", "dddd aaaa bbbb cccc",
                    Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);

            // DOC2: "aaaa" occurs twice, adjacent
            doc = new Document();
            doc.add(new Field("text", "dddd aaaa aaaa cccc",
                    Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);

            // DOC3: "aaaa" occurs twice, one term apart
            doc = new Document();
            doc.add(new Field("text", "dddd aaaa yyyy aaaa xxxx cccc",
                    Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);

            writer.optimize();
            writer.close();

            IndexSearcher searcher = new IndexSearcher(directory, false);

            SpanQuery[] clauses = new SpanQuery[2];
            clauses[0] = new SpanTermQuery(new Term("text", "aaaa"));
            clauses[1] = new SpanTermQuery(new Term("text", "aaaa"));

            // Don't care about order, so setting inOrder = false (slop = 1)
            SpanNearQuery q = new SpanNearQuery(clauses, 1, false);
            searcher.search(q, collector);

            // This assert fails - 3 docs are returned. Expecting only DOC2 and DOC3
            Assert.assertEquals("Check 2 results", 2, collector.getTotalHits());

            collector = TopScoreDocCollector.create(3, false);
            clauses = new SpanQuery[2];
            clauses[0] = new SpanTermQuery(new Term("text", "aaaa"));
            clauses[1] = new SpanTermQuery(new Term("text", "aaaa"));

            // Don't care about order, so setting inOrder = false (slop = 0)
            q = new SpanNearQuery(clauses, 0, false);
            searcher.search(q, collector);

            // This assert fails - 3 docs are returned. Expecting only DOC2
            Assert.assertEquals("Check 1 result", 1, collector.getTotalHits());
        }
    }

________________________________

From: Gregory Tarr [mailto:Gregory.tarr@detica.com]
Sent: 09 May 2011 12:29
To: java-user@lucene.apache.org
Subject: SpanNearQuery - inOrder parameter

I attach a JUnit test which shows strange behaviour of the inOrder parameter on the SpanNearQuery constructor, using Lucene 2.9.4.

My understanding of this parameter is that true forces the order and false doesn't care about the order.

Using true always works. However, using false works fine only when the terms in the query are distinct. If they are equivalent, e.g. searching for "john john", I do not get the expected results: documents containing the term only once are returned as well. The workaround seems to be to always use true for queries with repeated terms.

Any help?

Thanks
Greg

Please consider the environment before printing this email.

This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. The contents of this email may relate to dealings with other companies within the Detica Limited group of companies.

Detica Limited is registered in England under No: 1337451. Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP, England.
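The workaround mentioned in the message (use inOrder = true whenever the query repeats a term) could be sketched roughly as follows. This is only an illustration in plain Java with no Lucene dependency: the class `InOrderWorkaround` and the method `chooseInOrder` are hypothetical names, not part of the Lucene API, and the sketch assumes the query terms are available as plain strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Lucene): picks the value to pass as the
// inOrder argument of SpanNearQuery, following the workaround in the message.
class InOrderWorkaround {

    // Returns true if any term appears more than once; otherwise returns
    // the caller's requested inOrder value unchanged.
    static boolean chooseInOrder(List<String> terms, boolean requestedInOrder) {
        Set<String> seen = new HashSet<String>();
        for (String term : terms) {
            if (!seen.add(term)) {
                // Repeated term found: fall back to inOrder = true.
                return true;
            }
        }
        return requestedInOrder;
    }

    public static void main(String[] args) {
        // Repeated terms, as in the "john john" case: ordering is forced.
        System.out.println(chooseInOrder(Arrays.asList("aaaa", "aaaa"), false)); // prints "true"
        // Distinct terms: the requested value is kept.
        System.out.println(chooseInOrder(Arrays.asList("aaaa", "bbbb"), false)); // prints "false"
    }
}
```

The returned value would then be passed as the third argument of `new SpanNearQuery(clauses, slop, inOrder)`. Note that forcing the order is harmless when every clause is the same term, but for a query that mixes repeated and distinct terms it would also change which orderings match, so this is a stopgap rather than a fix.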
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org