lucene-solr-user mailing list archives

From Kent Mu <solr.st...@gmail.com>
Subject Re: solrcloud consumes more time than solr when write index
Date Wed, 13 Jul 2016 01:25:21 GMT
Dear Mr. Wartes,
Thanks for your reply. I see. For Solr we do have replicas, and for
SolrCloud we have 5 shards, each with one leader and one replica. The data
volume is nearly 100 million records. Do you mean we do not need to
optimize the index data?

Thanks!
Kent

2016-07-12 23:02 GMT+08:00 Jeff Wartes <jwartes@whitepages.com>:

> Well, two thoughts:
>
>
> 1. If you’re not using solrcloud, presumably you don’t have any replicas.
> If you are, presumably you do. This makes for a biased comparison, because
> SolrCloud won’t acknowledge a write until it’s been safely written to all
> replicas. In short, solrcloud write time is max(per-replica write time).
> The more replicas you add, the bigger the chance some replica randomly
> takes longer (gc pause, perhaps?), and the longer your overall write time,
> assuming a fixed number of indexing threads.
> 2. The parallelism of the optimize operation across replicas has gone back
> and forth a bit, and I’m not sure what it was doing in 4.9. However, at one
> point the optimize happened per-replica, serially. So it’d do
> shard1_replica1, then when that was done, do shard1_replica2, then
> shard2_replica1, etc. Other versions of Solr would do those at the same
> time. Again, I don’t know if you’re comparing to a non-replicated solr
> index, but that could explain some of the difference.
>
> There’s a sort of an obligatory comment at this point that optimize
> doesn’t necessarily save you a lot. There are certainly cases where it
> does, but if you haven’t already, you’ll want to validate that you have one
> of them and that you’re not just doing unnecessary work.
>
>
> On 7/12/16, 7:41 AM, "Kent Mu" <solr.study@gmail.com> wrote:
>
> >Hello, has anybody else come across this issue? Can anybody help me?
> >
> >2016-07-11 23:17 GMT+08:00 Kent Mu <solr.study@gmail.com>:
> >
> >> Hi friends!
> >>
> >> solr version: 4.9.0.
> >>
> >> We use both Solr and SolrCloud in our project, at the same time.
> >> But we find that SolrCloud takes more time than Solr when writing the
> >> index: nearly 5 or more times longer. I wonder why that is?
> >>
> >> In our project, a scheduler job adds to the index and then calls
> >> "optimize(false, true, 2)" to optimize the added index.
> >> I wonder if it is caused by SolrCloud internals: when writing the index,
> >> does SolrCloud need to judge which shard the data should be stored in?
> >> And when optimizing, does the replica need to take some time to
> >> synchronize the data from the leader?
> >>
> >> And what about queries? Will SolrCloud also take more time than Solr
> >> when querying data?
> >>
>
>
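
The two effects described in the quoted reply can be illustrated with a small toy model. This is only a sketch, not a measurement of Solr: the function names, the 10 ms base latency, the 1% chance of a 200 ms pause, and the 30-second per-core optimize time are all hypothetical values chosen for illustration. It assumes, as the reply states, that a write is acknowledged only once every replica has it, and it compares an optimize that runs one core at a time with one that runs on all cores at once.

```python
import random


def ack_time(replica_write_times):
    """A write is acknowledged only when the slowest replica has finished."""
    return max(replica_write_times)


def serial_optimize_time(per_core_times):
    """Optimize runs on one core after another, so the times add up."""
    return sum(per_core_times)


def parallel_optimize_time(per_core_times):
    """Optimize runs on all cores at once, so the slowest core dominates."""
    return max(per_core_times)


def mean_ack_time(num_replicas, trials=10_000, seed=0):
    """Average acknowledgement time under a hypothetical latency model.

    Assumed model (not taken from Solr): each replica needs 10 ms, plus a
    1% chance of an extra 200 ms pause (for example a GC pause).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = [
            10.0 + (200.0 if rng.random() < 0.01 else 0.0)
            for _ in range(num_replicas)
        ]
        total += ack_time(times)
    return total / trials


if __name__ == "__main__":
    # More copies per write means a higher chance that one of them is slow.
    for n in (1, 2, 4):
        print(n, "copies:", round(mean_ack_time(n), 2), "ms on average")

    # 5 shards x (1 leader + 1 replica) = 10 cores, 30 s each (hypothetical).
    cores = [30.0] * 10
    print("serial optimize:  ", serial_optimize_time(cores), "s")
    print("parallel optimize:", parallel_optimize_time(cores), "s")
```

With the layout from this thread (5 shards, each with a leader and a replica), a serial optimize in this model takes ten times as long as a single core, while a parallel one takes about as long as the slowest core. Real timings depend on hardware, segment sizes, and the Solr version.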
