lucene-solr-user mailing list archives

From Daniel Collins <danwcoll...@gmail.com>
Subject Re: Solr Sharding Strategy
Date Mon, 11 Apr 2016 15:12:06 GMT
I'd also ask about your indexing times: what QTime do you see for indexing
(in both scenarios), and what commit times are you using (which Toke
already asked about)?
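
For reference, Solr reports QTime (in milliseconds) in the responseHeader of
each response. A minimal sketch of how the indexing QTime could be summarised
for each scenario, assuming the update responses were captured as JSON; the
helper names and sample values below are hypothetical, not measurements from
this thread:

```python
import json
from statistics import mean


def qtime_ms(response_body: str) -> int:
    """Return the QTime (milliseconds) from one Solr JSON response body."""
    return json.loads(response_body)["responseHeader"]["QTime"]


def summarise(bodies):
    """Summarise QTime over a list of captured Solr JSON response bodies."""
    times = sorted(qtime_ms(b) for b in bodies)
    return {"count": len(times), "mean_ms": mean(times), "max_ms": times[-1]}


# Hypothetical captured update responses (illustrative values only).
samples = [
    '{"responseHeader": {"status": 0, "QTime": 12}}',
    '{"responseHeader": {"status": 0, "QTime": 30}}',
]
print(summarise(samples))
```

Running this once over the 1 shard responses and once over the 2 shard
responses would give directly comparable indexing latencies.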

I'm not entirely sure how to read your table, but looking at the indexing side
of things, with 2 shards there is inherently more work to do, so you would
expect indexing latency to increase (we have to index in 1 shard, and then
index in the 2nd shard, so logically it's twice the workload).

Your table suggests you managed 10 updates per second, but you never
managed 25 updates per second with either 1 shard or 2 shards. The numbers
don't quite make sense, though: you managed 13.9 updates per sec on 1 shard
and 21.9 updates per sec on 2 shards. That suggests to me that in the single
shard case your searches are causing your indexing to throttle; maybe the
resourcing is favoring searches, so the indexing threads aren't getting
a look in. In the 2 shard case, it seems clear (as Toke said) that search
isn't really hitting the index much. I'm not sure where the bottleneck is,
but it's not on the index, which is why your indexing load can get more
requests through.

On 11 April 2016 at 15:36, Toke Eskildsen <te@statsbiblioteket.dk> wrote:

> On Mon, 2016-04-11 at 11:23 +0000, Bhaumik Joshi wrote:
> > We are using solr 5.2.0 and we have Index-heavy (100 index updates per
> > sec) and Query-heavy (100 queries per sec) scenario.
>
> > Index stats: 10 million documents and 16 GB index size
>
> > Which sharding strategy is best suited in above scenario?
>
> Sharding reduces query throughput and can improve query latency as well
> as indexing speed. For small indexes, the overhead of sharding is likely
> to worsen query latency. So as always, it depends.
>
> Qualified guess: Don't use multiple shards, but consider using replicas.
>
> > Please share reference resources which states detailed comparison of
> > single shard over multi shard if any.
>
> Sorry, could not find the one I had in mind.
> >
> > Meanwhile we did some tests with SolrMeter (Standalone java tool for
> > stress tests with Solr) for single shard and two shards.
> >
> > Index stats of test solr cloud: 0.7 million documents and 1 GB index
> > size.
> >
> > As observed in test average query time with 2 shards is much higher
> > than single shard.
>
> Makes sense: Your shards are so small that the actual time spent on the
> queries is very low. So relatively, the overhead of distributed (aka
> multi-shard) searching is high, negating any search-gain you got by
> sharding. I would not have expected the performance drop-off to be that
> large (factor 20-60) though.
>
> Your query speed is unusually low for an index of your size, which leads
> me to believe that your indexing is slowing everything down. This is
> often due to too frequent commits and/or too many warm up queries.
>
> There is a bit about it at
> https://wiki.apache.org/solr/SolrPerformanceFactors
>
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>
>
