lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <shalinman...@gmail.com>
Subject Re: How to verify a document is indexed by all replicas
Date Tue, 24 Mar 2015 16:39:45 GMT
Hi Shai,

To your original question on how to know whether a document has been indexed
at all replicas: you can add a min_rf parameter to your indexing request
(it takes an integer, e.g. min_rf=2) and Solr will then add information to
the response about how many replicas gave an 'ack' to the leader. So if the
returned number is equal to the number of replicas, you can be sure that the
doc has been indexed everywhere.
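
A minimal Python sketch of that check follows. It is an illustration rather
than a definitive client: the helper names are made up for this example, and
it assumes the achieved replication factor comes back as "rf" inside
"responseHeader" of a JSON update response. It also assumes that
expected_replicas counts every replica of the shard that you expect to
acknowledge the update.

```python
import json
from urllib.parse import urlencode


def build_update_url(base_url, collection, min_rf):
    """Build an update URL that asks Solr to report the achieved replication factor.

    base_url and collection are placeholders for your own deployment.
    """
    params = urlencode({"min_rf": min_rf, "wt": "json"})
    return f"{base_url}/{collection}/update?{params}"


def achieved_rf(response_body):
    """Return the 'rf' value from the response header, or None if it is absent.

    Assumption: Solr reports the value as responseHeader.rf in the JSON body.
    """
    header = json.loads(response_body).get("responseHeader", {})
    return header.get("rf")


def indexed_everywhere(response_body, expected_replicas):
    """True only if the reported rf reaches the number of replicas we expect."""
    rf = achieved_rf(response_body)
    return rf is not None and rf >= expected_replicas
```

The functions only build the request URL and interpret the response body, so
they can be combined with whatever HTTP client you already use. If "rf" is
missing from the response, the sketch treats the update as not confirmed.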

More comments inline:

On Tue, Mar 24, 2015 at 8:18 AM, Shai Erera <serera@gmail.com> wrote:

> Thanks Erick,
>
> When a replica is down, no updates are sent to it. When it comes back up,
> it discovers that it needs to catch up with the leader. If there are many
> events it falls back to index replication (slower). During this period of
> time, is the replica considered ACTIVE or RECOVERING?
>
>
It is marked as recovering.


> And, can I assume that at any given moment (aside from ZK connection
> timeouts etc.) when I check the replicas' state, all the ones that report
> ACTIVE are in sync with each other?
>
>
Yes, 'active' replicas should be in sync, but autoCommits can cause
inconsistency between replicas as to what is visible to searchers (even if
all replicas have indexed the same data). Also, checking the state of the
replica is not enough: one should always check both that state=active and
that the replica is live, i.e. its node is marked live under /live_nodes in
ZK.
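
As a rough sketch of that combined check in Python: the function below takes
cluster state that has already been parsed into a dict and keeps only the
replicas that are both state=active and hosted on a live node. The dict layout
is an assumption modelled on the output of the Collections API CLUSTERSTATUS
action (cluster -> collections -> shards -> replicas, plus cluster ->
live_nodes); the function name is made up for this example.

```python
def usable_replicas(cluster_status, collection, shard):
    """Return the names of replicas that are active AND whose node is live.

    cluster_status is assumed to look like a parsed CLUSTERSTATUS response.
    """
    cluster = cluster_status["cluster"]
    live_nodes = set(cluster.get("live_nodes", []))
    replicas = cluster["collections"][collection]["shards"][shard]["replicas"]
    return sorted(
        name
        for name, replica in replicas.items()
        # state alone is not enough: the hosting node must also be live
        if replica.get("state") == "active"
        and replica.get("node_name") in live_nodes
    )
```

A replica whose last published state is 'active' but whose node has dropped
out of the live nodes list is filtered out, which is exactly the case that
checking the state alone would miss.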


> Shai
>
> On Tue, Mar 24, 2015 at 5:04 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
> > You can always issue a *:* query, but the update would have to be at
> > least your autoSoftCommit interval old, since the soft commit triggers
> > fire at slightly different wall clock times on each replica.
> >
> > But I don't think it should be necessary to wait. Since the
> > indexing request doesn't succeed until the docs have been written to
> > the tlogs, and since the tlogs will be replayed in the event of a
> > problem, your data should be fine. Of course, if you're indexing at a
> > very fast rate and your tlog is huge, it'll take a while....
> >
> > FWIW,
> > Erick
> >
> > On Tue, Mar 24, 2015 at 4:59 AM, Shai Erera <serera@gmail.com> wrote:
> > > Hi
> > >
> > > Is there a recommended, preferably fast, way to check that a document is
> > > indexed by all replicas? I currently do that by issuing a search request to
> > > each replica, but was wondering if there's a faster way.
> > >
> > > Even better, is there a way to verify all replicas of a shard are
> > > "up-to-date", e.g. by comparing their version or something? By "up-to-date"
> > > I mean that they've all processed the same update requests that came
> > > through.
> > >
> > > If there's a replica lagging behind, I'd like to wait for it to catch up,
> > > something like a checkpoint(), before I continue sending more updates.
> > >
> > > Shai
> >
>



-- 
Regards,
Shalin Shekhar Mangar.
