lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From stefcl <stefatw...@gmail.com>
Subject Difference between 2.4.1 and 2.9.0 (possible regression?)
Date Fri, 16 Oct 2009 12:35:53 GMT

Hello,

We are re currently migrating from 2.4.1 to 2.9.0. We've noticed some
changes in the results of fuzzy queries.
We have made this small test case :

********
StandardAnalyzer analyzer = new StandardAnalyzer();

Directory index = new RAMDirectory();
IndexWriter w = new IndexWriter(index, analyzer, true,
IndexWriter.MaxFieldLength.UNLIMITED);

addDoc(w, "Lucene in Action");
addDoc(w, "Lucene for Dummies");

addDoc(w, "Giga byte");

addDoc(w, "ManagingGigabytesManagingGigabyte");
addDoc(w, "ManagingGigabytesManagingGigabytes");

addDoc(w, "The Art of Computer Science");
addDoc(w, "J. K. Rowling");
addDoc(w, "JK Rowling");
addDoc(w, "Joanne K Roling");
addDoc(w, "Bruce Willis");
addDoc(w, "Willis bruce");
addDoc(w, "Brute willis");
addDoc(w, "B. willis");
w.close();
***************

Here's the problem :
We would expect the query 
Query q = new QueryParser("title", analyzer).parse( "giga~0.9" );

to match at least "Giga byte".

With lucene version 2.4.1 it returns :
1. Giga byte with score : 1.7948763

With 2.9, there's no matches, we have to go something as low as 0.7
("giga~0.7") to get some matches.

Could this be a regression?


http://www.nabble.com/file/p25924689/FirstShot.java Simple test case (1 file
here) 



 
-- 
View this message in context: http://www.nabble.com/Difference-between-2.4.1-and-2.9.0-%28possible-regression-%29-tp25924689p25924689.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message