lucene-java-user mailing list archives

From Toke Eskildsen
Subject Re: Scaling out/up or a mix
Date Wed, 01 Jul 2009 11:31:20 GMT
On Tue, 2009-06-30 at 22:59 +0200, Marcus Herou wrote:
> The number of concurrent users today is insignificant but once we push
> for the service we will get into trouble... I know that since even one
> simple faceting query (which we will use to display trend graphs) can
> take forever (talking about SOLR btw).

Ah, faceting. That could very well be the defining requirement for your
selection of hardware. As far as I remember, Solr supports two different
ways of faceting (depending on whether there are few or many tags in a
facet), where at least one of them uses counters corresponding to the
number of documents in the index. That scheme is similar to the approach
we're taking and in our experience this quickly moves the bottleneck to
RAM access speed. Now, I'm not at all a Solr expert so they might have
done something clever in that area; I'd recommend that you also state
your question on the Solr mailing list and mention what kind of faceting
you'll be performing.
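
To make the RAM argument concrete, here is a minimal sketch of counter-based
faceting in Java. It is not Solr's code and not our implementation; the class
name, the docToTag array and the single-tag-per-document layout are all
assumptions made for illustration. The point is only that every hit triggers
a lookup and an increment at effectively random positions in memory.

```java
import java.util.Arrays;
import java.util.BitSet;

public class FacetCountSketch {

    /**
     * Counts how many matching documents carry each tag.
     *
     * Assumptions (hypothetical, for illustration only):
     *  - docToTag[docId] holds the ordinal of the single tag of that document
     *  - hits marks the document ids that matched the query
     *  - tagCount is the number of distinct tags in the facet
     */
    public static int[] countFacets(int[] docToTag, BitSet hits, int tagCount) {
        int[] counts = new int[tagCount];
        // One array lookup plus one counter increment per hit. With millions
        // of hits spread over a large index, these accesses jump around in
        // memory, which is why memory access speed becomes the limit.
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            counts[docToTag[doc]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] docToTag = {0, 2, 1, 2, 2, 0}; // six documents, three tags
        BitSet hits = new BitSet(docToTag.length);
        hits.set(1);
        hits.set(3);
        hits.set(5);
        System.out.println(Arrays.toString(countFacets(docToTag, hits, 3)));
        // prints [1, 0, 2]
    }
}
```

Note that docToTag grows with the number of documents in the index, not with
the size of the result set, so the structure has to fit in RAM on each
machine for this to stay fast.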

> "Normal" Lucene queries (title:blah OR description:blah) timing is
> reasonable for the current hardware but not good (Currently 8 machines
> 2GB RAM each serving 130G index). It takes less than 10 secs at all
> times which of course is very bad user experience.

In your first post you stated "I currently have an index which is 16GB
per machine (8 machines = 128GB)" so it's a bit confusing to me what you

> Example of a public query (no sorting on publisheddate but rather on
> relevance = faster):

I tried a few searches and that seemed snappy enough. Not anywhere near
10 seconds?

> Sorry not meaning to advertise but I could not help it :) 

No problem. The BlogSpace thingy was nice eye-candy.
