lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3846) Fuzzy suggester
Date Fri, 12 Oct 2012 15:35:04 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475077#comment-13475077 ]

Simon Willnauer commented on LUCENE-3846:
-----------------------------------------

Here is another benchmark with minPrefix=3 instead of 1; this looks much better:

{noformat}
-- prefixes: 6-9, num: 7, onlyMorePopular: true
FuzzySuggester       queries: 50001, time[ms]: 2125 [+- 6.38], ~kQPS: 24
AnalyzingSuggester   queries: 50001, time[ms]: 452 [+- 3.61], ~kQPS: 111
JaspellLookup        queries: 50001, time[ms]: 187 [+- 1.02], ~kQPS: 267
TSTLookup            queries: 50001, time[ms]: 263 [+- 1.78], ~kQPS: 190
FSTCompletionLookup  queries: 50001, time[ms]: 269 [+- 1.59], ~kQPS: 186
WFSTCompletionLookup queries: 50001, time[ms]: 121 [+- 0.75], ~kQPS: 414
-- prefixes: 100-200, num: 7, onlyMorePopular: true
FuzzySuggester       queries: 50001, time[ms]: 2778 [+- 16.56], ~kQPS: 18
AnalyzingSuggester   queries: 50001, time[ms]: 414 [+- 1.70], ~kQPS: 121
JaspellLookup        queries: 50001, time[ms]: 133 [+- 1.85], ~kQPS: 376
TSTLookup            queries: 50001, time[ms]: 69 [+- 3.41], ~kQPS: 724
FSTCompletionLookup  queries: 50001, time[ms]: 257 [+- 1.79], ~kQPS: 194
WFSTCompletionLookup queries: 50001, time[ms]: 83 [+- 3.31], ~kQPS: 605
-- prefixes: 2-4, num: 7, onlyMorePopular: true
FuzzySuggester       queries: 50001, time[ms]: 1310 [+- 3.30], ~kQPS: 38
AnalyzingSuggester   queries: 50001, time[ms]: 995 [+- 8.03], ~kQPS: 50
JaspellLookup        queries: 50001, time[ms]: 507 [+- 4.19], ~kQPS: 99
TSTLookup            queries: 50001, time[ms]: 2148 [+- 16.63], ~kQPS: 23
FSTCompletionLookup  queries: 50001, time[ms]: 223 [+- 2.14], ~kQPS: 224
WFSTCompletionLookup queries: 50001, time[ms]: 414 [+- 28.44], ~kQPS: 121
{noformat}
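
For reading the table: the ~kQPS column is roughly the query count divided by the time in milliseconds (queries per millisecond equals thousands of queries per second). The rounded times printed above do not reproduce every figure exactly (for example 50001 / 121 is about 413, not 414), presumably because the benchmark divides by unrounded timings; that explanation is a guess. The class and method names below are hypothetical and are not part of Lucene or its benchmark code.

```java
// Sketch only: names here are hypothetical, not Lucene APIs.
final class ThroughputSketch {

    // Queries per millisecond is the same number as thousands of queries per second.
    static double kqps(int queries, double timeMs) {
        return queries / timeMs;
    }

    // How many times slower slowMs is than fastMs for the same number of queries.
    static double slowdown(double slowMs, double fastMs) {
        return slowMs / fastMs;
    }

    public static void main(String[] args) {
        // Figures from the "prefixes: 6-9" group: FuzzySuggester 2125 ms, AnalyzingSuggester 452 ms.
        System.out.printf("AnalyzingSuggester ~kQPS: %d%n", Math.round(kqps(50001, 452)));
        System.out.printf("Fuzzy vs Analyzing slowdown: %.1fx%n", slowdown(2125, 452));
    }
}
```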


                
> Fuzzy suggester
> ---------------
>
>                 Key: LUCENE-3846
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3846
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.1
>
>         Attachments: LUCENE-3846_fuzzy_analyzing.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch
>
>
> Would be nice to have a suggester that can handle some fuzziness (like spell correction) so that it's able to suggest completions that are "near" what you typed.
> As a first go at this, I implemented 1T (i.e., up to 1 edit, including a transposition), except the first letter must be correct.
> But there is a penalty, i.e., the "corrected" suggestion needs to have a much higher freq than the "exact match" suggestion before it can compete.
> Still tons of nocommits, and somehow we should merge this / make it work with analyzing suggester too (LUCENE-3842).
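
To illustrate the rule in the issue description (at most one edit, counting a transposition as one edit, with the first letter required to match, plus a penalty for corrected matches), here is a brute-force sketch. It is not the code from the attached patches. It compares plain strings rather than walking an automaton, and every name in it, along with the penalty factor of 10, is an assumption made for illustration; the description only says the corrected suggestion needs a "much higher freq".

```java
// Brute-force sketch; names are hypothetical and this is not the patch's code.
final class FuzzyPrefixSketch {

    // Assumed penalty factor; the issue description gives no concrete value.
    static final long HYPOTHETICAL_PENALTY = 10;

    // True if a and b share their first character and differ by at most one
    // insertion, deletion, substitution, or adjacent transposition.
    static boolean withinOneEdit(String a, String b) {
        if (a.isEmpty() || b.isEmpty()) return false; // no first letter to verify
        if (a.charAt(0) != b.charAt(0)) return false; // first letter must be correct
        int la = a.length(), lb = b.length();
        if (Math.abs(la - lb) > 1) return false;
        int i = 0; // index of the first mismatch
        while (i < Math.min(la, lb) && a.charAt(i) == b.charAt(i)) i++;
        if (la == lb) {
            if (i == la) return true; // identical strings
            if (a.regionMatches(i + 1, b, i + 1, la - i - 1)) return true; // substitution
            return i + 1 < la
                && a.charAt(i) == b.charAt(i + 1)
                && a.charAt(i + 1) == b.charAt(i)
                && a.regionMatches(i + 2, b, i + 2, la - i - 2); // transposition
        }
        // Lengths differ by one: skip the extra character in the longer string.
        String shorter = la < lb ? a : b;
        String longer = la < lb ? b : a;
        return shorter.substring(i).equals(longer.substring(i + 1));
    }

    // True if some prefix of the suggestion is within one edit of what was typed.
    static boolean matchesPrefix(String typed, String suggestion) {
        for (int len = typed.length() - 1; len <= typed.length() + 1; len++) {
            if (len >= 0 && len <= suggestion.length()
                && withinOneEdit(typed, suggestion.substring(0, len))) {
                return true;
            }
        }
        return false;
    }

    // Exact prefix matches keep their weight; corrected matches are divided by the penalty.
    static long effectiveWeight(String typed, String suggestion, long weight) {
        return suggestion.startsWith(typed) ? weight : weight / HYPOTHETICAL_PENALTY;
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix("lucnee", "lucene search")); // transposed "ne"
        System.out.println(matchesPrefix("xucene", "lucene search")); // wrong first letter
        System.out.println(effectiveWeight("lucnee", "lucene search", 1000));
    }
}
```

With this scoring, a corrected suggestion only outranks an exact one when its raw weight is more than HYPOTHETICAL_PENALTY times larger, which is one simple way to read the "much higher freq" requirement.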
