kafka-dev mailing list archives

From Mayuresh Gharat <gharatmayures...@gmail.com>
Subject Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
Date Mon, 26 Nov 2018 21:16:25 GMT
Hi Edoardo,

Thanks a lot for the KIP.
I have a few questions/suggestions in addition to what Radai has mentioned
above:

   1. Is this meant only for 1:1 replication, for example one Kafka cluster
   replicating to another, rather than multiple Kafka clusters mirroring
   into one Kafka cluster?
   2. Are we relying on exactly-once produce in the replicator? If not, how
   are retries handled in the replicator?
   3. What is the recommended value for in-flight requests here? Is it
   supposed to be strictly 1? If so, it would be great to mention that in the
   KIP.
   4. How is an unclean leader election in the source or destination
   cluster handled?
   5. How are offset resets of the replicator's consumer handled?
   6. It would be good to explain the workflow in the KIP with an
   example, showing how this KIP will change the replication scenario and
   how it will benefit the consumer apps.
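
To make questions 2 and 3 concrete, here is a minimal sketch of the producer settings they refer to. The key names are standard Kafka producer configuration properties; the values, the dict name and the helper function are assumptions for illustration only, not a recommendation from the KIP.

```python
# Hypothetical settings for the replicator's producer. The keys are standard
# Kafka producer properties; the values are assumptions, not KIP guidance.
replicator_producer_config = {
    "acks": "all",
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 1,
}


def allows_reordering_on_retry(config):
    """Return True if a retried batch could be written after a later batch.

    Assumption: without idempotence, more than one in-flight request per
    connection lets a failed batch be retried behind its successor.
    """
    # 5 is the documented client default for this property.
    in_flight = config.get("max.in.flight.requests.per.connection", 5)
    idempotent = config.get("enable.idempotence", False)
    return in_flight > 1 and not idempotent
```

If batches carry explicit offsets, a reordered retry would presumably be rejected by the broker rather than silently reordered, which is why the in-flight setting matters for this KIP.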

Thanks,

Mayuresh

On Mon, Nov 26, 2018 at 8:08 AM radai <radai.rosenblatt@gmail.com> wrote:

> a few questions:
>
> 1. how do you handle possible duplications caused by the "special"
> producer timing out/retrying? are you explicitly relying on the
> "exactly once" sequencing?
> 2. what about the combination of log compacted topics + replicator
> downtime? by the time the replicator comes back up there might be
> "holes" in the source offsets (some msgs might have been compacted
> out)? how is that recoverable?
> 3. similarly, what if you try and fire up replication on a non-empty
> source topic? does the kip allow for offsets starting at some
> arbitrary X > 0 ? or would this have to be designed from the start.
>
> and lastly, since this KIP seems to be designed for active-passive
> failover (there can be no produce traffic except the replicator)
> wouldn't a solution based on seeking to a time offset be more generic?
> your producers could checkpoint the last (say log append) timestamp of
> records they've seen, and when restoring in the remote site seek to
> those timestamps (which will be metadata in their committed offsets) -
> assuming replication takes > 0 time you'd need to handle some dups,
> but every kafka consumer setup needs to know how to handle those
> anyway.
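
The timestamp-based alternative described above can be sketched as a lookup of the earliest offset whose timestamp is at or after a checkpointed timestamp, similar in spirit to the Java consumer's `offsetsForTimes`. The function name, the in-memory log and all sample values are hypothetical; this is an illustration of the idea, not part of the KIP.

```python
from bisect import bisect_left


def offset_for_timestamp(log, checkpoint_ts):
    """Return the earliest offset whose timestamp is >= checkpoint_ts.

    `log` is a list of (offset, timestamp) pairs sorted by timestamp.
    Returns None when every record is older than the checkpoint.
    """
    timestamps = [ts for _, ts in log]
    i = bisect_left(timestamps, checkpoint_ts)
    return log[i][0] if i < len(log) else None


# Hypothetical destination partition: offsets differ from the source, and
# log-append timestamps are 5 units later because replication takes time.
destination_log = [(7, 105), (8, 115), (9, 125), (10, 135), (11, 145)]

# The last record seen on the source had log-append timestamp 120.
resume_offset = offset_for_timestamp(destination_log, 120)
```

In this made-up data the record appended at 120 on the source lands at 125 on the destination, so resuming from `resume_offset` re-reads it once. That is the duplicate the message says consumers must tolerate.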
> On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar <ECOMAR@uk.ibm.com> wrote:
> >
> > Hi Stanislav
> >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> >
> > a regular batch (for which offsets are generated by the broker on write)
> > is sent with a base offset of 0.
> > How could you distinguish it from a batch where you *want* the first
> > record to be written at offset 0 (i.e. be the first in the partition and
> > be rejected if there are records on the log already) ?
> > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > the records.
> >
> > For the replicator use case, a single produce request where all the data
> > is assumed to be either with offsets or without offsets seems to suffice,
> > so we added only a top-level flag, not a per-topic-partition one.
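
The ambiguity the flag resolves can be sketched with a toy in-memory log. This is only one possible reading of the semantics discussed in this thread: `PartitionLog`, `OffsetMismatch` and the `use_offsets` parameter are hypothetical names, and the sketch assumes a dense log in which a batch with offsets must start exactly at the log end offset.

```python
class OffsetMismatch(Exception):
    """Raised when a batch asks for an offset the log cannot honour."""


class PartitionLog:
    """Toy partition log: a record's list index is its offset (assumption)."""

    def __init__(self):
        self.records = []

    @property
    def log_end_offset(self):
        return len(self.records)

    def append(self, batch, base_offset=0, use_offsets=False):
        """Append a batch and return the offset of its first record.

        A regular batch arrives with base_offset 0 and the log assigns the
        offsets. With use_offsets=True the base offset is a request, and it
        is rejected unless it equals the current log end offset.
        """
        if use_offsets and base_offset != self.log_end_offset:
            raise OffsetMismatch(
                f"wanted offset {base_offset}, log end is {self.log_end_offset}"
            )
        first = self.log_end_offset
        self.records.extend(batch)
        return first
```

Both kinds of batch carry a base offset of 0 on the wire, so without the flag the log could not tell "assign offsets for me" apart from "this must be the first record in the partition" unless it inspected the records.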
> >
> > Thanks for your interest !
> > cheers
> > Edo
> > --------------------------------------------------
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> > Stanislav Kozlovski <stanislav@confluent.io> wrote on 22/11/2018 22:32:42:
> >
> > > From: Stanislav Kozlovski <stanislav@confluent.io>
> > > To: dev@kafka.apache.org
> > > Date: 22/11/2018 22:33
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > Cluster Replication
> > >
> > > Hey Edo & Mickael,
> > >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> > >
> > > > (I am reading your post thinking about
> > > > partitions rather than topics).
> > > Yes, I meant partitions. Sorry about that.
> > >
> > > Thanks for answering my questions :)
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar <ECOMAR@uk.ibm.com> wrote:
> > >
> > > > Hi Stanislav,
> > > >
> > > > you're right we envision the replicator use case to have a single producer
> > > > with offsets per partition (I am reading your post thinking about
> > > > partitions rather than topics).
> > > >
> > > > If a regular producer was to send its own records at the same time, it's
> > > > very likely that the one sending with an offset will fail because of
> > > > invalid offsets.
> > > > Same if two producers were sending with offsets, likely both would then
> > > > fail.
> > > >
> > > > > Does it make sense to *lock* the topic from other producers while there is
> > > > > one that uses offsets?
> > > >
> > > > You could do that with ACL permissions if you wanted, I don't think it
> > > > needs to be mandated by changing the broker logic.
> > > >
> > > >
> > > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > > for produce requests with that ACL to have offsets.
> > > >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > > I would not restrict a principal to only send-with-offsets (by making that
> > > > mandatory via the ACL).
> > > >
> > > > Thanks
> > > > Edo & Mickael
> > > >
> > > > --------------------------------------------------
> > > >
> > > > Edoardo Comar
> > > >
> > > > IBM Event Streams
> > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > >
> > > >
> > > > Stanislav Kozlovski <stanislav@confluent.io> wrote on 22/11/2018 16:17:11:
> > > >
> > > > > From: Stanislav Kozlovski <stanislav@confluent.io>
> > > > > To: dev@kafka.apache.org
> > > > > Date: 22/11/2018 16:17
> > > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > > > Cluster Replication
> > > > >
> > > > > Hey Edoardo, thanks for the KIP!
> > > > >
> > > > > I have some questions, apologies if they are naive:
> > > > > Is this intended to work for a single producer use case only?
> > > > > How would it work if two producers were producing to the same topic with
> > > > > offsets?
> > > > > How would it work if two producers, one with offsets and one without, were
> > > > > producing to a topic?
> > > > > Does it make sense to *lock* the topic from other producers while there is
> > > > > one that uses offsets?
> > > > >
> > > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > > for produce requests with that ACL to have offsets.
> > > > >
> > > > > Best,
> > > > > Stanislav
> > > > >
> > > > > On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar <ECOMAR@uk.ibm.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > > we've opened a KIP to improve data replication between Kafka clusters:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> > > > > >
> > > > > > We'd like to start a discussion, please post your feedback in this thread.
> > > > > >
> > > > > > Thank you
> > > > > > Edo and Mickael
> > > > > >
> > > > > >
> > > > > > --------------------------------------------------
> > > > > >
> > > > > > Edoardo Comar
> > > > > >
> > > > > > IBM Event Streams
> > > > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > > > >
> > > > > > Unless stated otherwise above:
> > > > > > IBM United Kingdom Limited - Registered in England and Wales with number
> > > > > > 741598.
> > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best,
> > > > > Stanislav
> > > >
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> >
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
