lucene-dev mailing list archives

From "Richard Marr (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-1690) Morelikethis queries are very slow compared to other search types
Date Sat, 13 Jun 2009 05:37:07 GMT
Morelikethis queries are very slow compared to other search types
-----------------------------------------------------------------

                 Key: LUCENE-1690
                 URL: https://issues.apache.org/jira/browse/LUCENE-1690
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/*
    Affects Versions: 2.4.1
            Reporter: Richard Marr
            Priority: Minor


The MoreLikeThis object performs term frequency lookups for every query. In my testing,
these lookups account for the majority of the time spent in MoreLikeThis searches.

For some (I'd venture many) applications it's not necessary for term statistics to be looked
up every time. A fairly naive opt-in caching mechanism tied to the life of the MoreLikeThis
object would allow applications to cache term statistics for the duration that suits them.
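
As a rough illustration of the idea (not the patch itself, which is not attached to this message), an opt-in cache of this kind might look like the sketch below. The class name CachedDocFreq, its methods, and the use of a plain function as the lookup are all hypothetical. In real code the lookup would presumably wrap something like IndexReader.docFreq(Term), and the cache would live inside or alongside the MoreLikeThis instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/**
 * Hypothetical sketch of an opt-in term statistics cache.
 * The lookup function stands in for the expensive index call;
 * nothing here is part of the actual MoreLikeThis API.
 */
class CachedDocFreq {
    private final ToIntFunction<String> lookup;   // expensive index lookup (assumed)
    private final Map<String, Integer> cache;     // null when caching is not enabled
    private int lookupCount = 0;                  // how many times the index was hit

    CachedDocFreq(ToIntFunction<String> lookup, boolean enableCache) {
        this.lookup = lookup;
        this.cache = enableCache ? new HashMap<String, Integer>() : null;
    }

    /** Returns the statistic for a term, consulting the cache first if enabled. */
    int docFreq(String term) {
        if (cache == null) {
            lookupCount++;
            return lookup.applyAsInt(term);
        }
        Integer cached = cache.get(term);
        if (cached == null) {
            lookupCount++;
            cached = lookup.applyAsInt(term);
            cache.put(term, cached);
        }
        return cached;
    }

    /** Lets the application decide when cached statistics have become stale. */
    void clear() {
        if (cache != null) {
            cache.clear();
        }
    }

    int lookupCount() {
        return lookupCount;
    }
}
```

Because the cache is tied to the object's lifetime and can be cleared explicitly, the application chooses how stale the statistics are allowed to become, for example by discarding the object whenever the index is reopened.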

I've got this working in my test code. I'll put together a patch file when I get a minute.
From my testing this can improve performance by a factor of around 10.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



