lucene-solr-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: indexing cpu utilization
Date Thu, 03 Jan 2013 04:20:48 GMT
I, too, was going to point to the number of threads, but I was going to
suggest using fewer of them, because the server has 32 cores and there was a
mention of 100 threads being used from the client.  My guess is that
the machine is busy juggling threads and context switching (what does the
vmstat 2 output look like, Uwe?) instead of doing real work.
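
To make the "fewer threads" suggestion concrete, here is a minimal sketch of a client that pushes batches through a fixed-size pool capped at the server's core count instead of spawning 100 threads. Everything named here (`BoundedIndexer`, `poolSize`, `indexAll`, the `sender` callback that stands in for the actual add call to Solr) is hypothetical, and whether the cap should equal the core count is an assumption to verify by measuring, not a rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class BoundedIndexer {

    // Cap the requested client thread count at the server's core count
    // (assumption: one indexing thread per server core is a sane start).
    static int poolSize(int requestedThreads, int serverCores) {
        return Math.max(1, Math.min(requestedThreads, serverCores));
    }

    // Push every batch through a fixed-size pool and return the number of
    // documents handed to the sender. The sender is a placeholder for the
    // real call that posts a batch to Solr.
    static int indexAll(List<List<String>> batches, int threads,
                        Consumer<List<String>> sender) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> batch : batches) {
                results.add(pool.submit(() -> {
                    sender.accept(batch);
                    return batch.size();
                }));
            }
            int sent = 0;
            for (Future<Integer> f : results) {
                sent += f.get(); // rethrows any failure from the sender
            }
            return sent;
        } finally {
            pool.shutdown();
        }
    }
}
```

With 100 requested threads and a 32-core server, `BoundedIndexer.poolSize(100, 32)` yields 32. Watching vmstat while varying that number would show whether context switching drops.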

Mark wanted to point out this other issue, though, so try that, too:
https://issues.apache.org/jira/browse/SOLR-3929

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Wed, Jan 2, 2013 at 11:13 PM, Gora Mohanty <gora@mimirtech.com> wrote:

> On 3 January 2013 05:55, Mark Miller <markrmiller@gmail.com> wrote:
> >
> > 32 cores eh? You probably have to raise some limits to take advantage of
> > that.
> >
> > https://issues.apache.org/jira/browse/SOLR-4078
> > support configuring IndexWriter max thread count in solrconfig
> >
> > That's coming in 4.1 and is likely important - the default is only 8.
> >
> > You might also want to experiment with using more merge threads? I think
> > the default may be 3.
> >
> > Beyond that, you may want to look at running multiple jvms on the one host
> > and doing distributed. That can certainly have benefits, but you have to
> > weigh against the management costs. And make sure process->processor
> > affinity is in gear.
> >
> > Finally, make sure you are using many threads to add docs...
> [...]
>
> Yes, making sure to use many threads is definitely good.
> We also found that indexing to multiple Solr cores, and
> doing one merge of all the indices at the end dramatically
> improved indexing time. As long as we had roughly one
> CPU core per Solr core (I am guessing that had to do
> with threading) indexing speed increased linearly with the
> number of Solr cores. Yes, the merge at the end is slow,
> and needs large disk space (at least twice the total index
> size), but one wins overall.
>
> Regards,
> Gora
>
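
The multi-core approach Gora describes above could be sketched roughly as follows: spread the documents round-robin across N Solr cores, index each partition separately, then issue a single CoreAdmin merge at the end. The class and method names, the host, and the core names are made up for illustration. The `mergeindexes` action with repeated `srcCore` parameters should be checked against the CoreAdmin documentation for the Solr version in use.

```java
import java.util.ArrayList;
import java.util.List;

class MultiCoreIndexPlan {

    // Split docs round-robin into one partition per Solr core.
    static <T> List<List<T>> partition(List<T> docs, int cores) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < cores; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            parts.get(i % cores).add(docs.get(i));
        }
        return parts;
    }

    // Build the CoreAdmin request that merges the source cores into the target.
    // Core names are assumed to be URL-safe; no escaping is done here.
    static String mergeUrl(String baseUrl, String target, List<String> sources) {
        StringBuilder url = new StringBuilder(baseUrl)
            .append("/admin/cores?action=mergeindexes&core=")
            .append(target);
        for (String source : sources) {
            url.append("&srcCore=").append(source);
        }
        return url.toString();
    }
}
```

The final merge is the step that needs the extra disk space mentioned above, so it is worth checking free space before issuing the request.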
