lucene-java-user mailing list archives

From Shai Erera
Subject Re: Performance problems with Lucene 2.9
Date Mon, 30 Nov 2009 15:56:10 GMT

First, you can use MatchAllDocsQuery, which matches all documents. It spares
you a HUGE posting list (TAG:TAG) and performs much faster. For example,
TAG:TAG computes a score for each doc, even though you don't need it;
MatchAllDocsQuery doesn't.

Second, move away from Hits! :) Use Collectors instead.
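
As a rough illustration of the Collector approach, here is a minimal sketch
against the Lucene 2.9 Collector API. The class name CountingCollector is made
up for this example; it only counts matching documents and never asks for a
score.

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

// Hypothetical helper: counts matches and ignores scores entirely.
public class CountingCollector extends Collector {
    private int count;

    @Override
    public void setScorer(Scorer scorer) {
        // Scores are not needed, so the scorer is never stored or called.
    }

    @Override
    public void collect(int doc) {
        // Called once per matching document; doc is relative to the current segment.
        count++;
    }

    @Override
    public void setNextReader(IndexReader reader, int docBase) {
        // Nothing to track per segment when only counting.
    }

    @Override
    public boolean acceptsDocsOutOfOrder() {
        // Counting does not depend on the order in which docs arrive.
        return true;
    }

    public int getCount() {
        return count;
    }
}
```

Assuming you already have a searcher, a query and a filter, it would be used
as searcher.search(query, filter, collector), after which getCount() returns
the number of matches.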

If I understand the chain of filters correctly, do you think you can code them
as a BooleanQuery to which you add BooleanClauses, each with its Term
(field:value)? You can add clauses with OR, AND, NOT, etc.
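
A minimal sketch of that idea, assuming the Lucene 2.9 API and using the
FOREIGN_PK and TYPE field names from the original message. The class and
method names are invented for the example, and both clauses are assumed to be
required (AND), which may not match every filter in the real chain.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

// Hypothetical helper: expresses two of the filters as required term clauses.
public class FilterAsQuery {
    public static BooleanQuery build(String foreignPk, String type) {
        BooleanQuery query = new BooleanQuery();
        // Occur.MUST behaves like AND; SHOULD is OR and MUST_NOT is NOT.
        query.add(new TermQuery(new Term("FOREIGN_PK", foreignPk)), BooleanClause.Occur.MUST);
        query.add(new TermQuery(new Term("TYPE", type)), BooleanClause.Occur.MUST);
        return query;
    }
}
```

Since the fields are untokenized, each value has to match the indexed term
exactly.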

Note that in Lucene 2.9 you can avoid scoring documents very easily, which
is a performance win if you don't need scores (i.e., if you just want to
match everything and don't care about scores).
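
One way this could look, assuming the Lucene 2.9 API: a sorted search that
combines MatchAllDocsQuery with an existing filter and asks TopFieldCollector
not to track scores. The class and method names are invented for the example,
and the filter and sort stand in for whatever the application already builds.

```java
import java.io.IOException;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopFieldCollector;

// Hypothetical helper: returns the top n filtered docs in sort order, unscored.
public class UnscoredSearch {
    public static TopDocs topSorted(IndexSearcher searcher, Filter filter, Sort sort, int n)
            throws IOException {
        TopFieldCollector collector = TopFieldCollector.create(
                sort,
                n,
                true,    // fillFields: keep the sort values on each result
                false,   // trackDocScores: do not compute per-document scores
                false,   // trackMaxScore: do not compute the maximum score
                false);  // docsScoredInOrder: do not assume in-order delivery
        // The filter may be null, in which case every document is considered.
        searcher.search(new MatchAllDocsQuery(), filter, collector);
        return collector.topDocs();
    }
}
```

Only the top n documents are kept in memory, instead of a Hits object over
the whole result set.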


On Mon, Nov 30, 2009 at 5:47 PM, Michel Nadeau wrote:

> Hi,
> we use Lucene to store around 300 million records. We use the index both
> for conventional searching and for all the system's data - we replaced
> MySQL with Lucene because it was simply not working at all with MySQL due
> to the amount of records. Our problem is that we have HUGE performance
> problems... whenever we search, it takes forever to return results, and
> Java uses 100% CPU/RAM.
> Our index fields are like this:
> PK
> ...other information depending on type...
> * All fields are Field.Index.UN_TOKENIZED
> * The field "TAG" always contains the value "TAG".
> Whenever we search in the index, our query is "TAG:TAG" to match all
> documents, and we do the search like this:
>        // Search
>        Hits h =, cluCF, cluSort);
> cluCF is a ChainedFilter containing all the other filters (like
> FOREIGN_PK=12345, TYPE=a, etc.).
> I know that the method is probably crazy because "TAG:TAG" is matching all
> 300M documents and then it applies filters; so that's probably why every
> little query is taking 100% CPU/RAM.... but I don't know how to do it
> properly.
> Help ! Any advice is welcome.
> - Mike
