lucene-solr-dev mailing list archives

From "John Wang (JIRA)" <>
Subject [jira] Commented: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large
Date Fri, 26 Mar 2010 22:10:27 GMT


John Wang commented on SOLR-1850:

Hi Yonik:

     No problem! Do you think overloading the constructor is the right thing to do here?


> KeepWordFilter can be slow at query time if wordlist is large
> -------------------------------------------------------------
>                 Key: SOLR-1850
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: John Wang
> When "Set<String> words" is large, constructing a KeepWordFilter at query time is very costly because the constructor copies the set, e.g.:
>     this.words = new CharArraySet(words, ignoreCase);
> This call does an addAll on the set and is repeated for every query, even though the work is identical each time.
> Suggestion: overload the constructor and expose the CharArraySet, e.g.:
>   public KeepWordFilter(TokenStream in, CharArraySet words ) {
>     super(in);
>     this.words = words;
>     this.termAtt = (TermAttribute)addAttribute(TermAttribute.class);
>   }
> This allows the CharArraySet to be constructed once, statically, for the application instead of at query time.
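
The suggestion in the quoted description can be sketched without Lucene on the classpath. This is only an illustration under stated assumptions: `PreparedWords` is a hypothetical stand-in for CharArraySet, `KeepFilter` is a hypothetical stand-in for KeepWordFilter that works on a plain token list rather than a TokenStream, and the word list is made up. It is not the actual Solr code or the eventual patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KeepWordsSketch {

    // Hypothetical stand-in for CharArraySet: a lookup set built once from a word list.
    static final class PreparedWords {
        private final Set<String> set;

        PreparedWords(Collection<String> words) {
            // This O(n) copy is the cost we want to pay once, not per query.
            this.set = new HashSet<>(words);
        }

        boolean contains(String token) {
            return set.contains(token);
        }
    }

    // Hypothetical stand-in for KeepWordFilter, filtering a token list
    // instead of a Lucene TokenStream.
    static final class KeepFilter {
        private final PreparedWords words;

        // Existing style: copies the word list on every construction.
        KeepFilter(Collection<String> words) {
            this(new PreparedWords(words));
        }

        // Proposed overload: reuses a set the caller built ahead of time.
        KeepFilter(PreparedWords words) {
            this.words = words;
        }

        List<String> filter(List<String> tokens) {
            List<String> kept = new ArrayList<>();
            for (String token : tokens) {
                if (words.contains(token)) {
                    kept.add(token);
                }
            }
            return kept;
        }
    }

    // Built once for the whole application, then shared by every query.
    static final PreparedWords SHARED =
            new PreparedWords(Arrays.asList("solr", "lucene", "filter"));

    public static void main(String[] args) {
        List<String> query = Arrays.asList("fast", "solr", "keep", "filter");
        // No per-query copy: the constructor only stores the shared reference.
        System.out.println(new KeepFilter(SHARED).filter(query));
    }
}
```

Sharing one set across queries is only safe if nothing mutates it after construction, which is why the sketch exposes no way to add words to `PreparedWords`.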

