lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3233) HuperDuperSynonymsFilter™
Date Sat, 09 Jul 2011 13:35:16 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062375#comment-13062375
] 

Michael McCandless commented on LUCENE-3233:
--------------------------------------------

I think this is ready to commit, but I'd like to rename the existing synonym filter to SlowSynonymFilter
and rename the new one to SynonymFilter.

Because there are some minor differences (deduping rules, lowercasing), I think Solr needs some
back-compat logic before it can cut over; I'll open a separate issue for this.

> HuperDuperSynonymsFilter™
> -------------------------
>
>                 Key: LUCENE-3233
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3233
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch,
LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch,
LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch,
synonyms.zip
>
>
> The current synonym filter uses a lot of RAM and CPU, especially at build time.
> I think yesterday I heard about "huge synonyms files" three times.
> So, I think we should use an FST-based structure, sharing the inputs and outputs.
> And we should be more efficient with the TokenStream API, e.g. using save/restoreState
> instead of cloneAttributes().

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
