lucene-dev mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-2479) need the ability to also sort SpellCheck results by freq, instead of just by Edit Distance+freq
Date Tue, 17 Aug 2010 20:31:18 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated LUCENE-2479:
------------------------------------

    Attachment: LUCENE-2479.patch

Patch that implements the comparator approach.  I didn't incorporate the freq into the scoring
b/c this would mean having to look up the freq. for every suggestion, which I think would
be pretty bad performance-wise.
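
As a rough illustration of the comparator approach (this is not the code in the attached
patch; the class, field, and constant names below are made up for the example), the two
sort orders could be expressed as interchangeable comparators over a small suggestion
holder that already carries its score and freq:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SuggestionSortSketch {

    /** Hypothetical suggestion holder; not the Lucene SuggestWord class. */
    static final class Suggestion {
        final String text;
        final float score; // similarity to the misspelled word; higher means closer
        final int freq;    // how often the suggested term occurs in the index

        Suggestion(String text, float score, int freq) {
            this.text = text;
            this.score = score;
            this.freq = freq;
        }
    }

    /** Existing behaviour as described in the issue: closest first, freq breaks ties. */
    static final Comparator<Suggestion> BY_SCORE_THEN_FREQ = new Comparator<Suggestion>() {
        @Override
        public int compare(Suggestion a, Suggestion b) {
            int c = Float.compare(b.score, a.score); // descending score
            return c != 0 ? c : Integer.compare(b.freq, a.freq);
        }
    };

    /** Requested behaviour: most frequent first, score breaks ties. */
    static final Comparator<Suggestion> BY_FREQ_THEN_SCORE = new Comparator<Suggestion>() {
        @Override
        public int compare(Suggestion a, Suggestion b) {
            int c = Integer.compare(b.freq, a.freq); // descending freq
            return c != 0 ? c : Float.compare(b.score, a.score);
        }
    };

    /** Sorts a copy of the input with the given order and returns the suggestion texts. */
    static List<String> sortedTexts(List<Suggestion> input, Comparator<Suggestion> order) {
        List<Suggestion> copy = new ArrayList<Suggestion>(input);
        Collections.sort(copy, order);
        List<String> texts = new ArrayList<String>();
        for (Suggestion s : copy) {
            texts.add(s.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        List<Suggestion> input = Arrays.asList(
                new Suggestion("brown", 0.8f, 5),
                new Suggestion("crown", 0.8f, 40),
                new Suggestion("grown", 0.6f, 900));
        System.out.println(sortedTexts(input, BY_SCORE_THEN_FREQ)); // [crown, brown, grown]
        System.out.println(sortedTexts(input, BY_FREQ_THEN_SCORE)); // [grown, crown, brown]
    }
}
```

In this sketch the distance/score computation is untouched and only the ordering of the
collected suggestions changes, so the caller picks a sort order by passing a different
comparator.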

I also refactored the Solr SpellCheckComponent a little bit so that it no longer has a
copy-and-paste of the SuggestWord* classes.  I intend to commit today or tomorrow.  All tests
pass and it is backward compatible.  I will also port back to 3.x.

> need the ability to also sort SpellCheck results by freq, instead of just by Edit Distance+freq
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2479
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2479
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/spellchecker
>         Environment: all
>            Reporter: Gerald DeConto
>            Assignee: Grant Ingersoll
>         Attachments: LUCENE-2479.patch
>
>
> This issue was first noticed and reported in this Solr thread: http://lucene.472066.n3.nabble.com/spellcheck-issues-td489776.html#a489788
> Basically, there are situations where it would be useful to sort by freq first, instead
> of the current "sort by edit distance, and then subsort by freq if edit distance is equal".
> The author of the thread suggested: "What I think would work even better than allowing
> a custom compareTo function would be to incorporate the frequency directly into the distance
> function.  This would allow for greater control over the trade-off between frequency and edit
> distance"
> However, custom compareTo functions are not always possible (e.g. if a certain version
> of Lucene must be used because it was released with Solr), and incorporating freq directly
> into the distance function may be overkill (depending on the implementation).
> It is suggested that we make a simple modification to the existing compareTo function
> in Lucene to allow users to specify whether they want the existing sort method or
> a sort by freq.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

