lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Difference between 2.4.1 and 2.9.0 (possible regression?)
Date Fri, 16 Oct 2009 16:02:39 GMT
This looks to have been caused by:

    http://issues.apache.org/jira/browse/LUCENE-1124

That change short-circuits all matching if the term is too short
relative to the min similarity.  But I guess something must be wrong
with the formula.
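
As a rough illustration of how such a short circuit can drop an
expected hit, here is a minimal sketch. It is not the LUCENE-1124 code.
It assumes the similarity is 1 - editDistance / min(term lengths), and
the class name, the method names and the exact length check are
hypothetical, written only to show the idea.

```java
public class FuzzyShortCircuitSketch {

    // Plain Levenshtein distance (insert, delete, substitute) using two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed similarity formula: 1 - distance / length of the shorter term.
    static float similarity(String query, String indexed) {
        int shorter = Math.min(query.length(), indexed.length());
        if (shorter == 0) {
            return query.length() == indexed.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(query, indexed) / shorter;
    }

    // Hypothetical length check: under the formula above, one edit on a
    // term of length n scores at most 1 - 1/n, so an inexact match can only
    // beat minSimilarity when n > 1 / (1 - minSimilarity).
    static boolean longEnoughForInexactMatch(String query, float minSimilarity) {
        return minSimilarity < 1.0f
                && query.length() > 1.0f / (1.0f - minSimilarity);
    }

    public static void main(String[] args) {
        String query = "giga";
        for (float minSim : new float[] {0.9f, 0.7f}) {
            System.out.println("minSimilarity=" + minSim
                    + " longEnough=" + longEnoughForInexactMatch(query, minSim)
                    + " exactMatchSimilarity=" + similarity(query, "giga"));
        }
    }
}
```

Under these assumptions, "giga" is too short for any inexact match at
0.9 but long enough at 0.7. An exact match still has similarity 1.0,
so skipping all matching for a short term would also discard the
exact hit.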

I'll reopen that issue and mark it to be fixed in 2.9.1.

Mike

On Fri, Oct 16, 2009 at 8:35 AM, stefcl <stefatwork@gmail.com> wrote:
>
> Hello,
>
> We are currently migrating from 2.4.1 to 2.9.0. We've noticed some
> changes in the results of fuzzy queries.
> We have made this small test case:
>
> ********
> StandardAnalyzer analyzer = new StandardAnalyzer();
>
> Directory index = new RAMDirectory();
> IndexWriter w = new IndexWriter(index, analyzer, true,
> IndexWriter.MaxFieldLength.UNLIMITED);
>
> addDoc(w, "Lucene in Action");
> addDoc(w, "Lucene for Dummies");
>
> addDoc(w, "Giga byte");
>
> addDoc(w, "ManagingGigabytesManagingGigabyte");
> addDoc(w, "ManagingGigabytesManagingGigabytes");
>
> addDoc(w, "The Art of Computer Science");
> addDoc(w, "J. K. Rowling");
> addDoc(w, "JK Rowling");
> addDoc(w, "Joanne K Roling");
> addDoc(w, "Bruce Willis");
> addDoc(w, "Willis bruce");
> addDoc(w, "Brute willis");
> addDoc(w, "B. willis");
> w.close();
> ***************
>
> Here's the problem :
> We would expect the query
> Query q = new QueryParser("title", analyzer).parse( "giga~0.9" );
>
> to match at least "Giga byte".
>
> With lucene version 2.4.1 it returns :
> 1. Giga byte with score : 1.7948763
>
> With 2.9, there are no matches; we have to go as low as 0.7
> ("giga~0.7") to get any matches.
>
> Could this be a regression?
>
>
> Simple test case (1 file): http://www.nabble.com/file/p25924689/FirstShot.java
>
> --
> View this message in context: http://www.nabble.com/Difference-between-2.4.1-and-2.9.0-%28possible-regression-%29-tp25924689p25924689.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
