lucene-java-user mailing list archives

From Christopher Condit <con...@sdsc.edu>
Subject RE: Best practices for searcher memory usage?
Date Wed, 14 Jul 2010 18:28:06 GMT
Hi Toke-
> > * 20 million documents [...]
> > * 140GB total index size
> > * Optimized into a single segment
> 
> I take it that you do not have frequent updates? Have you tried to see if you
> can get by with more segments without significant slowdown?

Correct - in fact there are no updates and no deletions. We index everything offline when
necessary and just swap the new index in...
By more segments, do you mean not calling optimize() at index time?
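
Roughly, our indexing path boils down to something like the sketch below (Lucene 3.0 API; the path, analyzer, and field are placeholders, not our real ones) - currently we call optimize() at the end, and I'm guessing "more segments" would mean either dropping that call or merging down to a handful with optimize(N):

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class OfflineIndexer {
    public static void main(String[] args) throws Exception {
        // Placeholder path - the real index is built offline and swapped in afterwards.
        Directory dir = FSDirectory.open(new File("/path/to/new-index"));
        IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_30),
                true, IndexWriter.MaxFieldLength.UNLIMITED);

        Document doc = new Document();
        doc.add(new Field("body", "chip nacho salsa", Field.Store.NO, Field.Index.ANALYZED));
        writer.addDocument(doc);

        // Current behavior: collapse everything into a single segment.
        // writer.optimize();

        // Alternative being asked about: skip optimize() entirely, or merge
        // down to a small number of segments instead of one.
        writer.optimize(5);

        writer.close();
        dir.close();
    }
}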

> > The application will run with 10G of -Xmx but any less and it bails out.
> > It seems happier if we feed it 12GB. The searches are starting to bog
> > down a bit (5-10 seconds for some queries)...
> 
> 10G sounds like a lot for that index. Two common memory-eaters are sorting
> by field value and faceting. Could you describe what you're doing in that
> regard?

No faceting and no sorting (other than score) for this index...
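
To be concrete, the search side is just the default relevance-ranked call, along the lines of this sketch (path and field name are placeholders) - no Sort object is passed, so no FieldCache should be getting populated:

import java.io.File;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class ScoreOnlySearch {
    public static void main(String[] args) throws Exception {
        // Open the index read-only; the path is a placeholder.
        IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")), true);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Plain relevance-ranked search: results come back sorted by score only.
        TopDocs hits = searcher.search(new TermQuery(new Term("body", "salsa")), 10);
        System.out.println("hits: " + hits.totalHits);

        searcher.close();
        reader.close();
    }
}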

> Similarly, the 5-10 seconds for some queries seems very slow. Could you give
> some examples on the queries that causes problems together with some
> examples of fast queries and how long they take to execute?

Typically just TermQueries or BooleanQueries, e.g.:
(Chip OR Nacho OR Foo) AND (Salsa OR Sauce) AND (This OR That)
The latter is the most typical case.
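
In code that shape comes out as nested BooleanQueries, something like this sketch (the "body" field name and the terms are just the toy example above):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class ExampleQueryBuilder {
    // Builds (Chip OR Nacho OR Foo) AND (Salsa OR Sauce).
    static Query buildExample() {
        BooleanQuery chips = new BooleanQuery();
        chips.add(new TermQuery(new Term("body", "chip")), Occur.SHOULD);
        chips.add(new TermQuery(new Term("body", "nacho")), Occur.SHOULD);
        chips.add(new TermQuery(new Term("body", "foo")), Occur.SHOULD);

        BooleanQuery sauces = new BooleanQuery();
        sauces.add(new TermQuery(new Term("body", "salsa")), Occur.SHOULD);
        sauces.add(new TermQuery(new Term("body", "sauce")), Occur.SHOULD);

        // Each OR group is a SHOULD clause internally; the groups are ANDed with MUST.
        BooleanQuery query = new BooleanQuery();
        query.add(chips, Occur.MUST);
        query.add(sauces, Occur.MUST);
        return query;
    }
}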

With a single keyword it will execute in under a second, but when there are, say, 10 clauses
it becomes much slower (which I understand - I'm just looking for ways to speed it up)...

Thanks,
-Chris