lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3801) Generify FST shortestPaths() to take a comparator
Date Fri, 02 Mar 2012 13:10:58 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220902#comment-13220902 ]

Robert Muir commented on LUCENE-3801:
-------------------------------------

I think this patch is ready to commit, but the tricky FST math there (compare + add) does
add some cost.

Still, 180,000 QPS versus 210,000 QPS or whatever, who cares. Being able to keep weights and
outputs separate and run shortest-path operations on just the weight side (with any Outputs
representation) is really powerful, and I think we can use it to improve our suggesters.
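
To make the "compare + add" point concrete, here is a minimal, self-contained sketch of a top-N cheapest-path search that only needs a Comparator and an add operation for the weight type. It is not the patch and does not use Lucene's FST classes: GenericTopN, Arc, Path, topNCosts and demo are all hypothetical names, the graph is a plain adjacency list, and it assumes an acyclic graph in which adding a weight never makes a path cheaper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.BinaryOperator;

/** Hypothetical sketch; none of these names come from Lucene's FST package. */
final class GenericTopN {

  /** Toy stand-in for an FST arc: a target node plus a weight of any type T. */
  static final class Arc<T> {
    final int target;
    final T weight;
    Arc(int target, T weight) { this.target = target; this.weight = weight; }
  }

  /** A partial path: the node reached so far and the cost accumulated on the way. */
  static final class Path<T> {
    final int node;
    final T cost;
    Path(int node, T cost) { this.node = node; this.cost = cost; }
  }

  /**
   * Best-first search returning the costs of up to topN cheapest paths from start to accept.
   * Assumes an acyclic graph and weights for which add never lowers a cost.
   */
  static <T> List<T> topNCosts(List<List<Arc<T>>> graph, int start, int accept, T zero,
      Comparator<T> comparator, BinaryOperator<T> add, int topN) {
    // "compare": the queue orders partial paths with the caller's comparator.
    PriorityQueue<Path<T>> queue =
        new PriorityQueue<>((a, b) -> comparator.compare(a.cost, b.cost));
    queue.add(new Path<T>(start, zero));
    List<T> results = new ArrayList<T>();
    while (!queue.isEmpty() && results.size() < topN) {
      Path<T> path = queue.poll();
      if (path.node == accept) {
        results.add(path.cost); // complete paths come off the queue cheapest first
        continue;
      }
      for (Arc<T> arc : graph.get(path.node)) {
        // "add": extend the accumulated cost by the arc weight.
        queue.add(new Path<T>(arc.target, add.apply(path.cost, arc.weight)));
      }
    }
    return results;
  }

  /** Four-node example with Long weights; node 3 is the accept node. */
  static List<Long> demo() {
    List<List<Arc<Long>>> graph = new ArrayList<List<Arc<Long>>>();
    graph.add(Arrays.asList(new Arc<Long>(1, 1L), new Arc<Long>(2, 4L)));
    graph.add(Arrays.asList(new Arc<Long>(3, 5L), new Arc<Long>(2, 1L)));
    graph.add(Arrays.asList(new Arc<Long>(3, 1L)));
    graph.add(new ArrayList<Arc<Long>>());
    return topNCosts(graph, 0, 3, 0L, Comparator.<Long>naturalOrder(), Long::sum, 2);
  }
}
```

Calling GenericTopN.demo() returns [3, 5]: the path 0-1-2-3 costs 3 and 0-2-3 costs 5. Swapping Long for another weight type only means passing a different zero, comparator and add function, which is the flexibility being traded against the extra per-arc calls.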

Benchmarks are all in QPS with a top-N of 7 suggestions (50,000 inputs):

||impl||prefixes 2-4||prefixes 6-9||prefixes 100-200||
|Jaspell|129,000|330,000|436,000|
|TST|31,000|258,000|820,000|
|FST|330,000|263,000|269,000|
|WFST|209,000|606,000|781,000|
|WFST-Generic|179,000|521,000|708,000|
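
As a quick check on what the generic version costs, the WFST and WFST-Generic rows can be compared directly. The snippet below is only arithmetic on the numbers in the table; SlowdownCheck and slowdownPercent are made-up names.

```java
/** Hypothetical helper; the QPS figures are copied from the table above. */
final class SlowdownCheck {

  /** Percentage of throughput lost going from baselineQps to candidateQps. */
  static double slowdownPercent(double baselineQps, double candidateQps) {
    return 100.0 * (baselineQps - candidateQps) / baselineQps;
  }

  public static void main(String[] args) {
    String[] buckets = {"prefixes 2-4", "prefixes 6-9", "prefixes 100-200"};
    double[] wfst = {209_000, 606_000, 781_000};
    double[] generic = {179_000, 521_000, 708_000};
    for (int i = 0; i < buckets.length; i++) {
      System.out.printf("%s: %.1f%% slower%n",
          buckets[i], slowdownPercent(wfst[i], generic[i]));
    }
  }
}
```

On these figures the generic version is about 14.4%, 14.0% and 9.3% slower across the three prefix ranges.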

I'll wait a bit in case someone wants to review or knows of ways to speed up the patch :)

                
> Generify FST shortestPaths() to take a comparator
> -------------------------------------------------
>
>                 Key: LUCENE-3801
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3801
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.6, 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-3801.patch, LUCENE-3801.patch, LUCENE-3801.patch
>
>
> Not sure we should do this; it costs 5-10% performance for WFSTSuggester.
> But maybe we can optimize something here, or maybe it's just no big deal to us.
> In general, this could be pretty powerful: e.g. if you needed to store
> some custom stuff in the suggester, you could use PairOutputs, or whatever.
> And there's the possibility we might need shortestPaths for other cool things... at the
> least I just wanted to have the patch up here.
> I haven't tested this on PairOutputs... but I've tested it with e.g. FloatOutputs
> and other things and it works fine.
> I tried to minimize the generics violations; there is only 1 (cannot create generic array).
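
For readers unfamiliar with the "cannot create generic array" limitation: Java rejects `new T[n]`, so generic code that wants an array usually allocates an Object[] and casts it, which needs one unchecked suppression. The sketch below shows that common workaround in a small bounded top-N buffer. It is an illustration only, not necessarily how the patch handles it, and TopNBuffer is a hypothetical class.

```java
import java.util.Comparator;

/** Hypothetical bounded buffer that keeps the smallest items in sorted order. */
final class TopNBuffer<T> {
  private final T[] items;
  private final Comparator<T> comparator;
  private int size;

  @SuppressWarnings("unchecked")
  TopNBuffer(int capacity, Comparator<T> comparator) {
    if (capacity < 1) {
      throw new IllegalArgumentException("capacity must be at least 1");
    }
    // "new T[capacity]" does not compile, so allocate Object[] and cast.
    // This is safe here because the array never escapes this class as a T[].
    this.items = (T[]) new Object[capacity];
    this.comparator = comparator;
  }

  /** Inserts item in sorted position, dropping the current worst item when full. */
  void offer(T item) {
    int i;
    if (size == items.length) {
      if (comparator.compare(item, items[size - 1]) >= 0) {
        return; // not better than the current worst entry
      }
      i = size - 1; // overwrite the worst entry
    } else {
      i = size++;
    }
    // Shift larger entries right until the insertion point is found.
    while (i > 0 && comparator.compare(item, items[i - 1]) < 0) {
      items[i] = items[i - 1];
      i--;
    }
    items[i] = item;
  }

  T get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index: " + index);
    }
    return items[index];
  }

  int size() {
    return size;
  }
}
```

Offering 5, 3, 4 and 1 to a buffer of capacity 2 with natural ordering leaves it holding 1 and 3.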


        
