lucene-java-user mailing list archives

From "Uwe Schindler" <>
Subject RE: OutOfMemoryError
Date Wed, 19 Oct 2011 06:23:43 GMT

> ...I get around 3
> million hits. Each of the hits is processed and information from a certain field is
> used.

That's of course fine, but:

> After certain number of hits, somewhere around 1 million (not always the same
> number) I get OutOfMemory exception that looks like this:

You did not tell us *how* you get the hits. If you do something like a search call that
passes 1000000 as the number of top hits, it can easily run out of memory (sooner or later,
maybe while decompressing results, maybe somewhere else). Lucene always collects "top-ranking"
results, and to do that it uses a priority queue. With the above command (passing 1 million or
more as the number of top-ranking results), this will use insane amounts of memory. Like most
full text search engines, Lucene is optimized for quickly getting the best results. Fetching
*all* possible hits is not really the right use case for a full text search engine (especially
as hits that far down the list are in most cases no longer relevant to your query).
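
To illustrate why the requested number of top hits drives memory use, here is a minimal
sketch of top-N selection with a bounded priority queue. It is plain Java and not Lucene's
own code; the class TopNSketch and the method topN are hypothetical names made up for this
illustration. The queue holds up to n entries at once, so asking for a million top hits
means keeping up to a million entries in memory, however few of them are ever looked at.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Stand-in sketch, not Lucene code: keeps the n highest scores seen so far.
final class TopNSketch {

    // Returns the n highest scores in descending order.
    static float[] topN(float[] scores, int n) {
        // Min-heap: the weakest of the current top n sits at the head.
        PriorityQueue<Float> heap = new PriorityQueue<>(Math.max(1, n));
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);               // queue not full yet, keep everything
            } else if (n > 0 && s > heap.peek()) {
                heap.poll();               // evict the current weakest entry
                heap.add(s);
            }
        }
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();          // smallest comes out first, so fill from the back
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.5f, 0.7f, 0.1f};
        System.out.println(Arrays.toString(topN(scores, 3)));
    }
}
```

With n set to a page-sized value such as 10 or 100 the queue stays tiny, which is the case
this kind of structure is designed for.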

To really collect all hits (but in arbitrary order, not sorted by relevance), write your own
Collector implementation that collects the results, and pass it to the searcher. There are
several code samples on this mailing list.
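
The idea behind such a collector can be sketched as follows. This is plain Java with a
stand-in interface, not Lucene's API: HitSink, setDocBase and StreamingCollectorSketch are
hypothetical names chosen so the example runs without the Lucene jar. In Lucene 3.x the
real abstract class is org.apache.lucene.search.Collector, whose methods are setScorer,
collect, setNextReader and acceptsDocsOutOfOrder; the per-hit field processing would go
where the callback is invoked below.

```java
import java.util.function.IntConsumer;

// Hypothetical stand-in for a search engine's collector callbacks; not Lucene's API.
interface HitSink {
    void setDocBase(int docBase);   // called once per index segment
    void collect(int segmentDocId); // called once per matching document
}

// Handles each hit as it arrives instead of queuing hits for ranking.
final class StreamingCollectorSketch implements HitSink {
    private final IntConsumer perHit;
    private int docBase;
    private long count;

    StreamingCollectorSketch(IntConsumer perHit) {
        this.perHit = perHit;
    }

    @Override
    public void setDocBase(int docBase) {
        this.docBase = docBase;
    }

    @Override
    public void collect(int segmentDocId) {
        // Process immediately; nothing is stored or sorted, so memory stays flat.
        perHit.accept(docBase + segmentDocId);
        count++;
    }

    long count() {
        return count;
    }

    public static void main(String[] args) {
        long[] sum = {0};
        StreamingCollectorSketch c = new StreamingCollectorSketch(id -> sum[0] += id);
        c.setDocBase(0);
        c.collect(0);
        c.collect(2);
        c.setDocBase(10);
        c.collect(1);
        System.out.println(c.count() + " hits, id sum " + sum[0]);
    }
}
```

Because each hit is handled and then forgotten, memory use does not depend on whether the
query matches three thousand or three million documents.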

Another approach is to use the new "searchAfter" method, available in the next Lucene version
(not yet released), to page through the results in smaller batches.

