lucene-solr-user mailing list archives

From Utkarsh Sengar <utkarsh2...@gmail.com>
Subject Re: Solr4 cluster setup for high performance reads
Date Fri, 21 Jun 2013 18:04:41 GMT
Thanks for the update, guys. I am working on the suggestions you shared.
One last question about the SolrCloud setup.

What is the recommended cluster size for SolrCloud? I have 3 nodes of Solr
and 3 nodes of ZK (running on the same machine, but in a different JVM).
After 2-3 days I notice that ZK reports one node as down, even though
everything is fine on that machine.
I then get this error when I query any node: "no servers hosting shard:
solr".

This definitely has to do with my setup. Even if one node goes down,
the whole cluster should not start barfing.
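
For reference, here is a minimal sketch of the majority-quorum arithmetic that a ZooKeeper ensemble relies on. It assumes the standard strict-majority rule; the function names are made up for illustration and are not part of any Solr or ZooKeeper API. It only covers the ZK side and says nothing about why a Solr shard would be reported as having no servers.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with `ensemble_size` members."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print quorum size and tolerated failures for a few ensemble sizes.
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

By this rule a 3-node ensemble should keep working with one ZK node down, and a fourth node adds no extra tolerance over three.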

Suggestions?

Thanks,
-Utkarsh


On Thu, Jun 13, 2013 at 7:28 PM, Shawn Heisey <solr@elyograg.org> wrote:

> On 6/13/2013 7:51 PM, Utkarsh Sengar wrote:
> > Sure, I will reduce the count and see how it goes. The problem I have is,
> > after such a change, I need to reindex everything again, which again is
> > slow and takes time (40-60hours).
>
> There should be no need to reindex after changing most things in
> solrconfig.xml.  Changing cache sizes does not require it.  Most of the
> time, reindexing is only required after changing schema.xml, but there
> are a few changes you can make to schema that don't require it.
>
> > Some queries are really bad, like this one:
> > http://explain.solr.pl/explains/bzy034qi
> > How can this be improved? I understand that there is something horribly
> > wrong here, but I am not sure what to look at (I have been using Solr
> > for the last 20 days).
>
> You are using a *LOT* of query clauses against your allText field in
> that boost query.  I assume that allText is your largest field.  I'm not
> really sure, but based on what we're seeing here, I bet that a bq
> parameter doesn't get cached.  With some additional RAM available, this
> might not be such a big problem.
>
> > The query is simple, although it used edismax. I have shared an explain
> > query above. Other than the query, this is my performance stats:
> >
> > iostat -m 5 result: http://apaste.info/hjNV
> >
> > top result: http://apaste.info/jlHN
>
> You've got a pretty well-sustained iowait around ten percent.  You are
> I/O bound.  You need more total RAM.  With indexing only happening once
> a day, that doesn't sound like it's a factor.  If you are also having
> problems with garbage collection because your heap is a little bit too
> small, that makes all the other problems worse.
>
> > For the initial training, I will hit solr 1.3M times and request 2000
> > documents in each query. By the current speed (just one machine), it will
> > take me ~20 days to do the initial training.
>
> This is really mystifying.  There is no need to send a million plus
> queries to warm your index.  A few dozen or a few hundred queries should
> be all you need, and you don't need 2000 docs returned per query.  Go
> with ten rows, or maybe a few dozen rows at most.  Because you're using
> SSD, I'm not sure you need warming queries at all.
>
> Thanks,
> Shawn
>
>
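
As a rough check on the initial-training estimate quoted above (1.3M queries, 2000 documents each, about 20 days on one machine), the following sketch works out the implied per-query time and total documents returned. The figures come from the thread; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_query(total_queries: int, days: float) -> float:
    """Average wall-clock seconds per query if the run takes `days` days."""
    return days * SECONDS_PER_DAY / total_queries


def total_documents(total_queries: int, rows_per_query: int) -> int:
    """Total number of documents returned across all queries."""
    return total_queries * rows_per_query


if __name__ == "__main__":
    queries = 1_300_000  # from the thread
    rows = 2000          # from the thread
    days = 20            # approximate estimate from the thread
    print(f"{seconds_per_query(queries, days):.2f} s per query")
    print(f"{total_documents(queries, rows):,} documents returned in total")
```

That comes to roughly 1.3 seconds per query and 2.6 billion documents returned over the whole run.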


-- 
Thanks,
-Utkarsh
