lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Autocompletion on large index
Date Wed, 06 Jul 2011 16:23:56 GMT
You could try storing your autocomplete index in a RAMDirectory?

But: I'm surprised you see the FST suggest impl using up so much RAM;
very low memory usage is one of the strengths of the FST approach.
Can you share the text (titles) you are feeding to the suggest module?
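
As a rough illustration of why an in-memory structure suits per-keypress lookups, here is a minimal plain-Java sketch of prefix completion over a sorted array of lowercased titles. It is a stand-in only: it is not the Lucene FST suggester and does not use RAMDirectory, and the class name `PrefixSuggest`, its methods, and the sample titles are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an in-memory autocompleter (not a Lucene class).
public class PrefixSuggest {
    // Lowercased titles, sorted once at build time.
    private final String[] sorted;

    public PrefixSuggest(List<String> titles) {
        sorted = new String[titles.size()];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = titles.get(i).toLowerCase(Locale.ROOT);
        }
        Arrays.sort(sorted);
    }

    // Returns up to max titles that start with the given prefix.
    public List<String> suggest(String prefix, int max) {
        String p = prefix.toLowerCase(Locale.ROOT);
        // Binary search for the first entry that is >= the prefix.
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid].compareTo(p) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // All matches are contiguous from that position onward.
        List<String> out = new ArrayList<>();
        for (int i = lo; i < sorted.length && out.size() < max
                && sorted[i].startsWith(p); i++) {
            out.add(sorted[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        PrefixSuggest s = new PrefixSuggest(Arrays.asList(
                "Lucene in Action", "Lucene internals", "Finite state transducers"));
        System.out.println(s.suggest("luc", 10));
    }
}
```

Each keypress costs one binary search plus a short scan, with no disk access. An FST additionally shares common prefixes and suffixes between entries, which is why it would normally be expected to need less memory than a flat array like this one.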

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jul 6, 2011 at 12:08 PM, Elmer <evanchastelet@gmail.com> wrote:
> Hi again.
>
> I have created my own autocompleter based on the spellchecker. This
> works well in the sense that it is able to create an autocompletion index
> from my 'publication' index. However, integrated in my web application,
> each keypress asks the autocompleter to search the index, which is stored
> on disk (not in memory), just like the spellchecker does (except that the
> spellchecker is not invoked on every keypress).
> With Lucene 3.3.0, autocompletion modules are included, which load
> their trees/fsa/... in memory. I'd like to use these modules, but the
> problem is that they use more than 2.5GB, causing heap space exceptions.
> This happens when I try to build a Lookup index (fst, jaspell or tst,
> it doesn't matter) from my 'publication' index consisting of 1.3M
> publications. The field I use for autocompletion holds the titles of the
> publications, indexed untokenized (but lowercased).
>
> Code:
> Lookup autoCompleter = new TSTLookup();
> FSDirectory dir = FSDirectory.open(new File("PATHTOINDEX"));
> LuceneDictionary dict = new LuceneDictionary(IndexReader.open(dir), "title_suggest");
> autoCompleter.build(dict);
>
> Is it possible to have the autocompletion module work in-memory on
> such a dataset without increasing Java's heap space?
> FTR, the 3.3.0 autocompletion modules use more than 2.5GB of RAM, whereas
> my own autocompleter index is stored on disk using about 300MB.
>
> BR,
> Elmer
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
