lucene-solr-user mailing list archives

From Kent Mu <>
Subject Re: solrcloud consumes more time than solr when write index
Date Wed, 13 Jul 2016 01:25:21 GMT
Dear Mr. Wartes,
Thanks for your reply. I see. Our standalone Solr setup does have replicas,
and our SolrCloud setup has 5 shards, each with one leader and one replica.
The index holds nearly 100 million records. Do you mean we do not need to
optimize the index?


2016-07-12 23:02 GMT+08:00 Jeff Wartes <>:

> Well, two thoughts:
> 1. If you’re not using solrcloud, presumably you don’t have any replicas.
> If you are, presumably you do. This makes for a biased comparison, because
> SolrCloud won’t acknowledge a write until it’s been safely written to all
> replicas. In short, solrcloud write time is max(per-replica write time).
> The more replicas you add, the bigger the chance some replica randomly
> takes longer (gc pause, perhaps?), and the longer your overall write time,
> assuming a fixed number of indexing threads.
> 2. The parallelism of the optimize operation across replicas has gone back
> and forth a bit, and I’m not sure what it was doing in 4.9. However, at one
> point the optimize happened per-replica, serially. So it’d do
> shard1_replica1, then when that was done, do shard1_replica2, then
> shard2_replica1, etc. Other versions of Solr would do those at the same
> time. Again, I don’t know if you’re comparing to a non-replicated solr
> index, but that could explain some of the difference.
> There’s a sort of obligatory comment at this point that optimize
> doesn’t necessarily save you a lot. There are certainly cases where it
> does, but if you haven’t already, you’ll want to validate that you have one
> of them and that you’re not just doing unnecessary work.
> On 7/12/16, 7:41 AM, "Kent Mu" <> wrote:
> >Hello, has anybody else come across this issue? Can anybody help me?
> >
> >2016-07-11 23:17 GMT+08:00 Kent Mu <>:
> >
> >> Hi friends!
> >>
> >> solr version: 4.9.0.
> >>
> >> We use Solr and SolrCloud in our project, that is, we run standalone
> >> Solr and SolrCloud at the same time.
> >> We have found that SolrCloud takes much longer than Solr to write the
> >> index: nearly 5 times longer, or more. Why is that?
> >>
> >> In our project, a scheduled job adds documents to the index and then
> >> calls "optimize(false, true, 2)" to optimize the newly added index data.
> >> I wonder whether this is caused by SolrCloud internals: when writing the
> >> index, SolrCloud needs to decide which shard each document should be
> >> stored in, and when optimizing, the replica needs some time to
> >> synchronize the data from the leader?
> >>
> >> What about queries? Will SolrCloud also take more time than Solr when
> >> querying data?
> >>
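
Point 1 in Jeff Wartes's reply above, that SolrCloud write time is max(per-replica write time), can be illustrated with a small simulation. This is only a sketch and does not call Solr. The class name, the method names, and the latency model (10 ms per write, with a 5% chance of a 200 ms stall such as a GC pause) are hypothetical values chosen for illustration, not measurements.

```java
import java.util.Random;

public class ReplicaWriteLatency {

    // A write is acknowledged only after every replica has written it,
    // so the acknowledged time is the slowest replica's time.
    static double ackTime(double[] perReplicaMillis) {
        double max = 0.0;
        for (double t : perReplicaMillis) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Hypothetical latency model (an assumption, not Solr data):
    // 10 ms normally, 200 ms with 5% probability.
    static double sampleReplicaMillis(Random rng) {
        return rng.nextDouble() < 0.05 ? 200.0 : 10.0;
    }

    // Average acknowledged write time over many simulated writes.
    static double meanAckTime(int replicas, int writes, long seed) {
        Random rng = new Random(seed);
        double[] times = new double[replicas];
        double total = 0.0;
        for (int w = 0; w < writes; w++) {
            for (int r = 0; r < replicas; r++) {
                times[r] = sampleReplicaMillis(rng);
            }
            total += ackTime(times);
        }
        return total / writes;
    }

    public static void main(String[] args) {
        for (int replicas = 1; replicas <= 4; replicas++) {
            System.out.printf("replicas=%d mean ack time=%.1f ms%n",
                    replicas, meanAckTime(replicas, 100_000, 42L));
        }
    }
}
```

Under this model the mean acknowledged time grows with the number of copies, because each extra copy adds another chance that at least one of them stalls. Real numbers depend on hardware, the number of indexing threads, and the Solr version.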
