lucene-solr-user mailing list archives

From Edward Ribeiro <edward.ribe...@gmail.com>
Subject Re: Soft commit and new replica types
Date Fri, 14 Dec 2018 17:30:55 GMT
Indeed! It clarified a lot, thank you. :) Now I know I messed with the
reload core config, but the other aspects were more or less what I have
been expecting.

Do you think it's worth submitting a PR to the Reference Guide with those
explanations? I can take a stab at it.

Regards,
Edward

On Fri, Dec 14, 2018 at 3:08 AM Tomás Fernández Löbbe <tomasflobbe@gmail.com>
wrote:

> > >
> > > No, I am not seeing reloads.
>
> Ah, good.
>
>
> > > I am trying to understand the interactions
> > > between hard commit, soft commit, transaction log update with a TLOG
> > > cluster for both leader and follower replicas. For example, after getting
> > > new segments from the leader the follower replica will still apply the
> > > hard/soft commit?
> >
>
> Think about the hard commit as a flush of the latest updates to a segment
> plus a checkpoint pointing to all the current valid segments. That checkpoint
> is also a file. The soft commit is similar to the hard commit in the sense
> that it creates a segment and a pointer to the valid segments; however,
> those segments may not be flushed to disk yet, and the checkpoint is not in
> a file. *In addition* to creating segments, the commits in Solr create
> searchers to get the latest view of the index (hard commits only when
> openSearcher=true, soft commits always), but that doesn't really matter
> in the context of replication.
>
> The follower replica (a TLOG/PULL) will ask the leader for the last hard
> commit and replicate all the segments and the file indicating the commit.
> All the TLOG/PULL replica does after it replicates is open a searcher with
> all the segments in that checkpoint. Two important notes here: 1) the
> follower replica doesn't "perform" a commit, it copies it from the leader,
> and 2) this "open a searcher" is not a soft/hard commit, it is just opening a
> searcher (a "commit" usually involves creating segments).
>
> * If in the leader (a TLOG replica) you do a soft commit, it'll never make
> it to the follower, because the follower only replicates the latest hard
> commit (see ReplicationHandler.indexCommitPoint).
> * If in the follower (a TLOG replica) you do a soft commit, it won't make any
> difference, because in the TLOG case, documents are not added to the index
> (only to the transaction log). (See the UpdateCommand.IGNORE_INDEXWRITER flag.)
> * If in the follower (a PULL replica) you do a soft commit, it also
> won't make any difference, because it doesn't receive the documents anyway
> (it only replicates). The commit is skipped anyway (see
> DistributedUpdateProcessor.processCommit).
>
> The transaction log is only used for recovery purposes (or realtime get).
>
> I hope that clarifies things.
>
> >
> > > PS: congratulations on the Berlin Buzzwords' talk. :)
> >
> Thanks!
>
> > >
> > > Thanks!
> > >
> > > On Mon, Dec 10, 2018 at 9:24 PM Tomás Fernández Löbbe
> > > <tomasflobbe@gmail.com>
> > > wrote:
> > >
> > > > I think this is a good point. The tricky part is that if TLOG replicas
> > > > don't replicate often, their transaction logs will get too big too, so you
> > > > want the replication interval of TLOG replicas to be tied to the
> > > > auto(hard)Commit interval (by default at least). If you are using them for
> > > > search, you may also not want to open a searcher for each fetch... for PULL
> > > > replicas, maybe the best way is to use the autoSoftCommit interval to
> > > > define the polling interval. That said, I'm not sure using different
> > > > configurations is a good idea, some people may be mixing TLOG and PULL and
> > > > querying them both alike.
> > > >
> > > > In the meantime, if you have different hosts for TLOG and PULL replicas,
> > > > one workaround you can have is to define the autoCommit time with a system
> > > > property, and use different properties for TLOGs vs PULL nodes.
> > > >
> > > > > There is no commit on TLOG/PULL follower replicas, only on the leader.
> > > > > Followers fetch the segments and **reload the core** every 150 seconds
> > > >
> > > > Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches. Are
> > > > you seeing reloads?
> > > >
> > > > On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson <erickerickson@gmail.com>
> > > > wrote:
> > > >
> > > > > bq. but not every poll attempt they fetch new segment from the leader
> > > > >
> > > > > Ah, right. Ignore my comment. Commit will only occur on the followers
> > > > > when there are new segments to pull down, so you're right, roughly
> > > > > every second poll would find things to bring down and open a
> > > > > new searcher.........
> > > > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro <edward.ribeiro@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi Vadim,
> > > > > >
> > > > > > There is no commit on TLOG/PULL follower replicas, only on the leader.
> > > > > > Followers fetch the segments and **reload the core** every 150 seconds (if
> > > > > > there were new segments, I suppose). Yeah, followers don't pay the CPU
> > > > > > price of indexing, but there are still cache invalidation, autowarming,
> > > > > > etc., in addition to network and IO demand. Is that right, Erick?
> > > > > >
> > > > > > Besides that, Erick is pointing out that under a heavy indexing workload
> > > > > > you could either have:
> > > > > >
> > > > > > 1. Very large transaction logs;
> > > > > >
> > > > > > 2. Very large numbers of segments. If that is the case, you could have the
> > > > > > following scenario numerous times:
> > > > > >    2.1. follower replica downloads segment A and B from leader;
> > > > > >    2.2. leader merges segments A + B into C;
> > > > > >    2.3. follower replicas discard A and B and download C on next poll;
> > > > > >
> > > > > > Under the second condition followers needlessly downloaded segments that
> > > > > > would eventually be merged.
> > > > > >
> > > > > > IMO, you should carefully evaluate if the use of TLOG/PULL is really
> > > > > > recommended for your cluster setup, plus indexing and querying workload.
> > > > > > You can very much stay with a NRT setup if it suits you better. The videos
> > > > > > below provide a nice set of hints for when to choose between NRT or some
> > > > > > combination of TLOG and PULL.
> > > > > >
> > > > > > https://youtu.be/XIb8X3MwVKc
> > > > > >
> > > > > > https://youtu.be/dkWy2ykzAv0
> > > > > >
> > > > > > https://youtu.be/XqfTjd9KDWU
> > > > > >
> > > > > > Regards,
> > > > > > Edward
> > > > > >
> > > > > > On Sun, Dec 9, 2018 at 16:56, <vadim.ivanov@spb.ntk-intourist.ru> wrote:
> > > > > >
> > > > > > >
> > > > > > > If hard commit max time is 300 sec then commit happens every 300 sec on
> > > > > > > tlog leader. And new segments pop up on the leader every 300 sec, during
> > > > > > > indexing. Polling interval on other replicas 150 sec, but not every poll
> > > > > > > attempt they fetch new segment from the leader, afaiu. Erick, do you mean
> > > > > > > that on all other tlog replicas (not leaders) commit occurs every poll?
> > > > > > > Sunday, December 09, 2018, 19:21 +03:00 from Erick Erickson
> > > > > > > erickerickson@gmail.com:
> > > > > > >
> > > > > > > >Not quite, 600000. The polling interval is half the commit interval....
> > > > > > > >
> > > > > > > >This has always bothered me a little bit, I wonder at the utility of a
> > > > > > > >config param. We already have old-style replication with a
> > > > > > > >configurable polling interval. Under very heavy indexing loads, it
> > > > > > > >seems to me that either the tlogs will grow quite large or we'll be
> > > > > > > >pulling a lot of unnecessary segments across the wire, segments
> > > > > > > >that'll soon be merged away and the merged segment re-pulled.
> > > > > > > >
> > > > > > > >Apparently, though, nobody's seen this "in the wild", so it's
> > > > > > > >theoretical at this point.
> > > > > > > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov <vadim.ivanov@spb.ntk-intourist.ru> wrote:
> > > > > > > >
> > > > > > > > > Thanks, Edward, for clues.
> > > > > > > > > What bothers me is newSearcher start, warming, cache clear... all that
> > > > > > > > > CPU consuming stuff in my heavy-indexing scenario.
> > > > > > > > > With NRT I had autoSoftCommit: 300000.
> > > > > > > > > So I had new Searcher no more than every 5 min on every replica.
> > > > > > > > > To have more or less the same effect with TLOG - PULL collection,
> > > > > > > > I suppose, I have to have  :  300000
> > > > > > > > > (yes, I understand that newSearchers start asynchronously on leader and
> > > > > > > > > replicas)
> > > > > > > > Am I right?
> > > > > > > > --
> > > > > > > > Vadim
> > > > > > > >
> > > > > > > >
> > > > > > > >> -----Original Message-----
> > > > > > > >> From: Edward Ribeiro [mailto:edward.ribeiro@gmail.com]
> > > > > > > >> Sent: Sunday, December 09, 2018 12:42 AM
> > > > > > > >> To:  solr-user@lucene.apache.org
> > > > > > > >> Subject: Re: Soft commit and new replica types
> > > > > > > >>
> > > > > > > >> Some insights in the new replica types below:
> > > > > > > >>
> > > > > > > > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > > > > > > > >> vadim.ivanov@spb.ntk-intourist.ru> wrote:
> > > > > > > >>
> > > > > > > >>>
> > > > > > > > >>> From Ref guide we have:
> > > > > > > > >>> " NRT is the only type of replica that supports soft-commits..."
> > > > > > > > >>> "If TLOG replica does become a leader, it will behave the same as if it
> > > > > > > > >>> was a NRT type of replica."
> > > > > > > > >>> Does it mean that if we do not have NRT replicas in the cluster then the
> > > > > > > > >>> autoSoftCommit section in solrconfig.xml is ignored completely (even on
> > > > > > > > >>> the TLOG leader)?
> > > > > > > >>>
> > > > > > > >>
> > > > > > > > >> No, not completely. Both TLOG and PULL nodes will periodically poll the
> > > > > > > > >> leader for changes in index segments' files and download those segments
> > > > > > > > >> from the leader. If hard commit max time is defined in solrconfig.xml the
> > > > > > > > >> polling interval of each replica will be half that value. Or else if the
> > > > > > > > >> soft commit max time is defined then the replicas will use half the soft
> > > > > > > > >> commit max time as the interval. If neither is defined then the poll
> > > > > > > > >> interval will be 3 seconds (hard coded). See here:
> > > > > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> > > > > > > >>
> > > > > > > > >> If the TLOG replica is the leader, it will index locally and append the
> > > > > > > > >> doc to the transaction log as an NRT node would, and it will also
> > > > > > > > >> synchronously replicate the data to other TLOG replicas' transaction logs
> > > > > > > > >> (PULL nodes don't have transaction logs). But TLOG/PULL replicas don't
> > > > > > > > >> support soft commits nor real time gets, afaik.
> > > > > > > >>
> > > > > > > >>>
> > > > > > > >>
> > > > > > > >>>
> > > > > > > >>> 60000
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > > >>> Should we say that in autoCommit section openSearcher is always true in
> > > > > > > > >>> that case?
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> <autoCommit>
> > > > > > > > >>   <maxDocs>10000</maxDocs>
> > > > > > > > >>   <maxTime>30000</maxTime>
> > > > > > > > >>   <maxSize>512m</maxSize>
> > > > > > > > >>   <openSearcher>false</openSearcher>
> > > > > > > > >> </autoCommit>
> > > > > > > >>
> > > > > > > > >> Does it mean that new Searcher always starts on all replicas when hard
> > > > > > > > >> commit happens on leader?
> > > > > > > >>
> > > > > > > >>
> > > > > > > > >> Nope. Or at least, the searcher is not synchronously created. Each non
> > > > > > > > >> leader replica will periodically fetch the index changes from the leader
> > > > > > > > >> and open a new searcher to reflect those changes as seen here:
> > > > > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
> > > > > > > > >> But it's important to note the potential delay between the leader's
> > > > > > > > >> hard commit and the other replicas fetching those changes from the leader
> > > > > > > > >> and opening a new searcher to reflect the latest changes.
> > > > > > > >>
> > > > > > > > >> PS: I am still digging into these new replica types, so I may have
> > > > > > > > >> misunderstood or missed some aspect of them.
> > > > > > > >>
> > > > > > > >> Regards,
> > > > > > > >> Edward
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> >
> >
>
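
The polling-interval rule described in the thread (half the hard commit max
time, else half the soft commit max time, else 3 seconds) can be sketched in
Java roughly as below. This is only an illustration of the rule as stated in
the messages, not the code in ReplicateFromLeader.java: the class name, the
method name, and the -1 "not configured" sentinel are assumptions made for the
example.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch of the TLOG/PULL polling-interval rule discussed in the
 * thread. Names and the UNSET sentinel are illustrative assumptions, not
 * Solr's actual API.
 */
final class PollIntervalSketch {

    /** Assumed sentinel meaning "this maxTime is not configured". */
    static final long UNSET = -1L;

    /** Fallback from the thread: 3 seconds when neither commit time is set. */
    static final long DEFAULT_POLL_MILLIS = TimeUnit.SECONDS.toMillis(3);

    /**
     * Returns the follower polling interval in milliseconds:
     * half the hard commit max time if set, otherwise half the soft commit
     * max time if set, otherwise the 3 second default.
     */
    static long pollIntervalMillis(long autoCommitMaxTimeMillis,
                                   long autoSoftCommitMaxTimeMillis) {
        if (autoCommitMaxTimeMillis != UNSET) {
            return autoCommitMaxTimeMillis / 2;
        }
        if (autoSoftCommitMaxTimeMillis != UNSET) {
            return autoSoftCommitMaxTimeMillis / 2;
        }
        return DEFAULT_POLL_MILLIS;
    }

    public static void main(String[] args) {
        // Hard commit every 300 s: followers poll every 150 s.
        System.out.println(pollIntervalMillis(300_000L, UNSET));
        // Only a soft commit of 60 s is set: followers poll every 30 s.
        System.out.println(pollIntervalMillis(UNSET, 60_000L));
        // Nothing configured: the 3 s default applies.
        System.out.println(pollIntervalMillis(UNSET, UNSET));
        // Both set: the hard commit value takes precedence.
        System.out.println(pollIntervalMillis(600_000L, 60_000L));
    }
}
```

With a 300000 ms hard commit this gives a 150000 ms poll, which matches the
150 seconds discussed above; getting a poll every 300 seconds would need a
hard commit max time of 600000 ms, as Erick notes.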
