From: Ivan Vasilev
To: LUCENE MAIL LIST
Date: Thu, 03 Apr 2008 13:03:40 +0300
Subject: PhraseQuery little bug?

Hi Guys,

I ran the following test. I created 2 files.
File1.txt with content: "apple 2 3 4 pear"
File2.txt with content: "pear 2 3 4 apple"

I ran the following search tests:

1. Using the Luke Search tab.
1.1. Searching for content:"pear apple"~3 returns File1.txt.
1.2. Searching for content:"pear apple"~4 gives the same result: File1.txt.
1.3. Searching for content:"pear apple"~5 returns both File1.txt and File2.txt.

2. Using a simple app built on the PhraseQuery class. The results are the same as in Test 1.

    PhraseQuery pq = new PhraseQuery();
    pq.add(new Term("content", "apple"));
    pq.add(new Term("content", "pear"));

2.1. pq.setSlop(3); returns File1.txt.
2.2. pq.setSlop(4); gives the same result: File1.txt.
2.3. pq.setSlop(5); returns both File1.txt and File2.txt.

3. Using a simple app built on the SpanNearQuery class. The results are now different (in my opinion these are the correct results):

    new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term("content", "apple")),
        new SpanTermQuery(new Term("content", "pear"))}, 3, false);

Both File1.txt and File2.txt are returned. Changing the slop from 3 to 2 (in the constructor) gives no results. Changing it to 4 again returns both files. Changing the inOrder boolean (in the constructor) from false to true means, of course, that File2.txt is no longer returned, whatever the other settings.

Although this is not very important, I think PhraseQuery behaves in a slightly buggy way. When the slop equals the number of words between the two search terms, only the files that contain them IN THE SAME ORDER are returned. When the slop is 2 greater (that is, it covers the search terms as well as the words in between), THE ORDER of the search terms no longer matters.

Best Regards,
Ivan
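
One way to see where the extra 2 might come from is the small sketch below. It is only a simplified model, assuming that PhraseQuery slop behaves like a distance on term positions (each term's position is shifted back by its index in the phrase, and the spread of the shifted positions is compared to the slop), while an unordered SpanNearQuery counts only the positions between the matched terms. The names SlopSketch, phraseSlop and unorderedSpanSlop are made up for illustration and are not Lucene APIs; the sketch also assumes whitespace tokenization and that each term occurs exactly once per document.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model only; this is NOT Lucene's scorer code.
// Assumptions: whitespace tokenization, each term occurs exactly once per document.
class SlopSketch {

    // Smallest slop at which a sloppy phrase would match under this model:
    // shift each term's position back by its index in the phrase, then take
    // the spread (max - min) of the shifted positions.
    static int phraseSlop(String doc, String... phrase) {
        List<String> tokens = Arrays.asList(doc.split("\\s+"));
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phrase.length; i++) {
            int pos = tokens.indexOf(phrase[i]);
            if (pos < 0) {
                return -1; // term missing: no match at any slop
            }
            int shifted = pos - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // Smallest slop for an unordered span-near match of single terms under this model:
    // the number of positions inside the covering span that the terms do not occupy.
    static int unorderedSpanSlop(String doc, String... terms) {
        List<String> tokens = Arrays.asList(doc.split("\\s+"));
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (String term : terms) {
            int pos = tokens.indexOf(term);
            if (pos < 0) {
                return -1; // term missing: no match at any slop
            }
            min = Math.min(min, pos);
            max = Math.max(max, pos);
        }
        return (max + 1 - min) - terms.length;
    }

    public static void main(String[] args) {
        String file1 = "apple 2 3 4 pear";
        String file2 = "pear 2 3 4 apple";
        System.out.println(phraseSlop(file1, "apple", "pear"));        // 3
        System.out.println(phraseSlop(file2, "apple", "pear"));        // 5
        System.out.println(unorderedSpanSlop(file1, "apple", "pear")); // 3
        System.out.println(unorderedSpanSlop(file2, "apple", "pear")); // 3
    }
}
```

Under this model the reversed document needs 2 more slop for the phrase than for the unordered span, which is consistent with the results reported in tests 2 and 3.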