lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ivan Vasilev <ivasi...@sirma.bg>
Subject Re: PhraseQuery little bug?
Date Thu, 03 Apr 2008 10:18:04 GMT
Of cours in our system I can use SpanNearQuery instead of PhraseQuery.
My question is is there known performance differences between the two 
classes?



Ivan Vasilev wrote:
> Hi Guys,
>
> I make the following test – I create 2 files. File1.txt with content:
> “apple 2 3 4 pear”
> And File2.txt with content:
> “pear 2 3 4 apple”
>
> I made the following searching tests:
> 1. Using Luke Search tab.
>
> 1.1. When searching for:
> content:"pear apple"~3
> Then the File1.txt is returned.
>
> 1.2. When searching for:
> content:"pear apple"~4
> Result is the same – File1.txt
>
> 1.3. When searching for:
> content:"pear apple"~5
> Both File1.txt and File2.txt are returned.
>
> 2. Using simple app that uses the class PhraseQuery – the results are 
> the same as in Test 1.
> PhraseQuery pq = new PhraseQuery();
> pq.add(new Term("content", "apple"));
> pq.add(new Term("content", "pear"));
>
> 2.1. pq.setSlop(3);
> Then the File1.txt is returned.
>
> 2.2. pq.setSlop(4);
> Result is the same – File1.txt
>
> 2.3. pq.setSlop(5);
> Both File1.txt and File2.txt are returned.
>
> 3. Using simple app that uses the class SpanNearQuery – the results 
> are now different (on my opinion these are the correct results):
> new SpanNearQuery(new SpanQuery[]{
> new SpanTermQuery(new Term("content", "apple")),
> new SpanTermQuery(new Term("content", "pear"))}, 3, false);
>
> Both File1.txt and File2.txt are returned.
> When changing the slop from 3 to 2 (in the constructor) – no results 
> found.
>
> When changing it to 4 – again 2 results are returned.
> When changing the inOrder boolean (in the constructor) from false to 
> true – then of course File2.txt is not returned under any other 
> conditions.
>
> Although this is not very important but I think PhraseQuery has a bit 
> buggy behavior. When searching with slop equal to the number of words 
> that are between the two searched words than the files that contain 
> them IN THE SAME ORDER are returned.
> When slop is with 2 greater – ranging search words, as well as, in 
> between words – then THE ORDER of the searched words does not matter.
>
> Best Regards,
> Ivan
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> __________ NOD32 2996 (20080403) Information __________
>
> This message was checked by NOD32 antivirus system.
> http://www.eset.com
>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message