kafka-dev mailing list archives

From daniel.locious@gmail.com <daniel.loci...@gmail.com>
Subject Re: [DISCUSS] KIP-382: MirrorMaker 2.0
Date Mon, 19 Nov 2018 20:55:23 GMT
Hi guys,

This is an exciting topic. Could I have a word here?
I understand there are many scenarios to which MirrorMaker can be applied. I am currently working
on an active/active DC solution using MirrorMaker; our goal is to allow clients to fail over
to the Kafka cluster in the other DC when an incident happens.

To do this, I need:
1. MirrorMaker to replicate each partition's messages in order, and in a timely fashion,
to the same partition on the other cluster (both clusters must also be guaranteed to
create the same number of partitions for a topic), so that a consumer can pick up the
latest messages in the right order.
2. MirrorMaker to replicate the local consumer offsets to the other side, so that the consumer
knows its offset and where the latest messages are.
3. MirrorMaker to provide cycle detection for messages across the DCs.

I can see the possibility for remote topics to solve all of these problems, as long as the consumer
can treat the remote topic the same as the local topic, i.e. a consumer that has permission
to consume topic1 would, on subscribing, automatically subscribe to both remote.topic1 and
local.topic1. First, we need to find a way to grant topic ACLs to the consumer across the
DCs. Second, the consumer needs to be able to subscribe to topics by wildcard or suffix. Last
but not least, the consumer has to deal with the time ordering of the messages from
the two topics.
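
To illustrate the last two points, here is a minimal sketch in plain Python, with no Kafka
client involved. The names Record, matching_topics and merge_by_timestamp are hypothetical
and exist only for this example. It assumes each topic's records already arrive in timestamp
order, and that the clocks in both DCs are close enough for timestamps to be comparable,
which will not always hold in practice.

```python
import heapq
import re
from typing import Iterable, Iterator, List, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for a consumed message."""
    timestamp_ms: int
    topic: str
    value: str


def matching_topics(base: str, available: Iterable[str]) -> List[str]:
    """Return every topic named '<prefix>.<base>', e.g. local.topic1 and remote.topic1."""
    pattern = re.compile(r"^[^.]+\." + re.escape(base) + r"$")
    return sorted(t for t in available if pattern.match(t))


def merge_by_timestamp(*streams: Iterable[Record]) -> Iterator[Record]:
    """Lazily merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda r: r.timestamp_ms)


if __name__ == "__main__":
    topics = ["local.topic1", "remote.topic1", "local.topic2"]
    print(matching_topics("topic1", topics))

    local = [Record(1, "local.topic1", "a"), Record(4, "local.topic1", "d")]
    remote = [Record(2, "remote.topic1", "b"), Record(3, "remote.topic1", "c")]
    for record in merge_by_timestamp(local, remote):
        print(record.timestamp_ms, record.topic, record.value)
```

The merge only interleaves the two topics; it does not repair ordering within a topic, so
it still depends on requirement 1 above.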

My understanding is that all of these should be configurable, so that they can be turned on or off
to fit different use cases.

Interestingly, I was going to add extra headers carrying the source DC info to topic messages, for
cycle detection.
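
Roughly what I have in mind, as a sketch only: headers are modelled here as a plain dict of
strings, whereas real Kafka headers are key/bytes pairs. The header key "mm2-visited" and the
function mirror_headers are made up for this example and are not part of the KIP. Instead of
stamping only the origin DC, the sketch records every DC a message has passed through, so
that loops across more than two DCs are also caught.

```python
from typing import Dict, Optional

VISITED_HEADER = "mm2-visited"  # hypothetical header key, not defined by KIP-382


def mirror_headers(headers: Dict[str, str], source: str, target: str) -> Optional[Dict[str, str]]:
    """Return the headers for the mirrored copy, or None if mirroring would create a cycle.

    `source` is the cluster the record is read from and `target` is the cluster
    it would be written to.
    """
    raw = headers.get(VISITED_HEADER, "")
    visited = raw.split(",") if raw else []
    if source not in visited:
        visited.append(source)
    if target in visited:
        # The record has already been in the target cluster: skip it.
        return None
    mirrored = dict(headers)  # copy, so the caller's headers are not mutated
    mirrored[VISITED_HEADER] = ",".join(visited)
    return mirrored


if __name__ == "__main__":
    first_hop = mirror_headers({}, source="dc1", target="dc2")
    print(first_hop)
    print(mirror_headers(first_hop, source="dc2", target="dc1"))
```

The cost is that every record has to be inspected and modified, which is the efficiency
concern Ryanne raises below.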

I look forward to your reply.

Regards,

Dan
On 2018/10/23 19:56:02, Ryanne Dolan <ryannedolan@gmail.com> wrote: 
> Alex, thanks for the feedback.
> 
> > Would it be possible to utilize the
> > Message Headers feature to prevent infinite recursion
> 
> This isn't necessary due to the topic renaming feature which already
> prevents infinite recursion.
> 
> If you turn off topic renaming you lose cycle detection, so maybe we could
> provide message headers as an optional second mechanism. I'm not opposed to
> that idea, but there are ways to improve efficiency if we don't need to
> modify or inspect individual records.
> 
> Ryanne
> 
> On Tue, Oct 23, 2018 at 6:06 AM Alex Mironov <alexandrfox@gmail.com> wrote:
> 
> > Hey Ryanne,
> >
> > Awesome KIP, excited to see improvements in MirrorMaker land. I particularly
> > like the reuse of Connect framework! Would it be possible to utilize the
> > Message Headers feature to prevent infinite recursion? For example, MM2
> > could stamp every message with a special header payload (e.g.
> > MM2="cluster-name-foo") so in case another MM2 instance sees this message
> > and it is configured to replicate data into "cluster-name-foo" it would
> > just skip it instead of replicating it back.
> >
> > On Sat, Oct 20, 2018 at 5:48 AM Ryanne Dolan <ryannedolan@gmail.com>
> > wrote:
> >
> > > Thanks Harsha. Done.
> > >
> > > On Fri, Oct 19, 2018 at 1:03 AM Harsha Chintalapani <kafka@harsha.io>
> > > wrote:
> > >
> > > > Ryanne,
> > > >        Makes sense. Can you please add this under rejected alternatives so
> > > > that everyone has context on why it wasn’t picked.
> > > >
> > > > Thanks,
> > > > Harsha
> > > > On Oct 18, 2018, 8:02 AM -0700, Ryanne Dolan <ryannedolan@gmail.com>,
> > > > wrote:
> > > >
> > > > Harsha, concerning uReplicator specifically, the project is a major
> > > > inspiration for MM2, but I don't think it is a good foundation for anything
> > > > included in Apache Kafka. uReplicator uses Helix to solve problems that
> > > > Connect also solves, e.g. REST API, live configuration changes, cluster
> > > > management, coordination etc. This also means that existing tooling,
> > > > dashboards etc that work with Connectors do not work with uReplicator, and
> > > > any future tooling would need to treat uReplicator as a special case.
> > > >
> > > > Ryanne
> > > >
> > > > On Wed, Oct 17, 2018 at 12:30 PM Ryanne Dolan <ryannedolan@gmail.com>
> > > > wrote:
> > > >
> > > >> Harsha, yes I can do that. I'll update the KIP accordingly, thanks.
> > > >>
> > > >> Ryanne
> > > >>
> > > >> On Wed, Oct 17, 2018 at 12:18 PM Harsha <kafka@harsha.io> wrote:
> > > >>
> > > >>> Hi Ryanne,
> > > >>>                Thanks for the KIP. I am also curious about why
not
> > use
> > > >>> the uReplicator design as the foundation given it alreadys resolves
> > > some of
> > > >>> the fundamental issues in current MIrrorMaker, updating the confifgs
> > > on the
> > > >>> fly and running the mirror maker agents in a worker model which
can
> > > >>> deployed in mesos or container orchestrations.  If possible can
you
> > > >>> document in the rejected alternatives what are missing parts that
> > made
> > > you
> > > >>> to consider a new design from ground up.
> > > >>>
> > > >>> Thanks,
> > > >>> Harsha
> > > >>>
> > > >>> On Wed, Oct 17, 2018, at 8:34 AM, Ryanne Dolan wrote:
> > > >>> > Jan, these are two separate issues.
> > > >>> >
> > > > >>> > 1) consumer coordination should not, ideally, involve unreliable or slow
> > > > >>> > connections. Naively, a KafkaSourceConnector would coordinate via the
> > > > >>> > source cluster. We can do better than this, but I'm deferring this
> > > > >>> > optimization for now.
> > > > >>> >
> > > > >>> > 2) exactly-once between two clusters is mind-bending. But keep in mind that
> > > > >>> > transactions are managed by the producer, not the consumer. In fact, it's
> > > > >>> > the producer that requests that offsets be committed for the current
> > > > >>> > transaction. Obviously, these offsets are committed in whatever cluster the
> > > > >>> > producer is sending to.
> > > > >>> >
> > > > >>> > These two issues are closely related. They are both resolved by not
> > > > >>> > coordinating or committing via the source cluster. And in fact, this is the
> > > > >>> > general model of SourceConnectors anyway, since most SourceConnectors
> > > > >>> > _only_ have a destination cluster.
> > > > >>> >
> > > > >>> > If there is a lot of interest here, I can expound further on this aspect of
> > > > >>> > MM2, but again I think this is premature until this first KIP is approved.
> > > > >>> > I intend to address each of these in separate KIPs following this one.
> > > >>> > Ryanne
> > > >>> >
> > > > >>> > On Wed, Oct 17, 2018 at 7:09 AM Jan Filipiak <Jan.Filipiak@trivago.com>
> > > > >>> > wrote:
> > > >>> >
> > > > >>> > > This is not a performance optimisation. It's a fundamental design
> > > > >>> > > choice.
> > > > >>> > >
> > > > >>> > >
> > > > >>> > > I never really took a look at how Streams does exactly-once (it's a trap
> > > > >>> > > anyway, and you can usually deal with at-least-once downstream pretty
> > > > >>> > > easily). But I am very certain it's not going to get anywhere if the offset
> > > > >>> > > commit cluster and the record produce cluster are not the same.
> > > > >>> > >
> > > > >>> > > Pretty sure that without this _design choice_ you can skip exactly-once
> > > > >>> > > already.
> > > >>> > >
> > > >>> > > Best Jan
> > > >>> > >
> > > >>> > > On 16.10.2018 18:16, Ryanne Dolan wrote:
> > > >>> > > >  >  But one big obstacle in this was
> > > > >>> > > > always that group coordination happened on the source cluster.
> > > > >>> > > >
> > > > >>> > > > Jan, thank you for bringing up this issue with legacy MirrorMaker. I
> > > > >>> > > > totally agree with you. This is one of several problems with MirrorMaker
> > > > >>> > > > I intend to solve in MM2, and I already have a design and prototype that
> > > > >>> > > > solves this and related issues. But as you pointed out, this KIP is
> > > > >>> > > > already rather complex, and I want to focus on the core feature set
> > > > >>> > > > rather than performance optimizations for now. If we can agree on what
> > > > >>> > > > MM2 looks like, it will be very easy to agree to improve its performance
> > > > >>> > > > and reliability.
> > > > >>> > > >
> > > > >>> > > > That said, I look forward to your support on a subsequent KIP that
> > > > >>> > > > addresses consumer coordination and rebalance issues. Stay tuned!
> > > >>> > > >
> > > >>> > > > Ryanne
> > > >>> > > >
> > > > >>> > > > On Tue, Oct 16, 2018 at 6:58 AM Jan Filipiak <Jan.Filipiak@trivago.com
> > > > >>> > > > <mailto:Jan.Filipiak@trivago.com>> wrote:
> > > >>> > > >
> > > >>> > > >     Hi,
> > > >>> > > >
> > > > >>> > > >     Currently MirrorMaker is usually run collocated with the target
> > > > >>> > > >     cluster.
> > > > >>> > > >     This is all nice and good. But one big obstacle in this was
> > > > >>> > > >     always that group coordination happened on the source cluster. So when
> > > > >>> > > >     the network was congested, you sometimes lose group membership and
> > > > >>> > > >     have to rebalance and all this.
> > > > >>> > > >
> > > > >>> > > >     So one big request from us would be support for having the coordination
> > > > >>> > > >     cluster != source cluster.
> > > > >>> > > >
> > > > >>> > > >     I would generally say a LAN is better than a WAN for doing group
> > > > >>> > > >     coordination, and there is no reason we couldn't have a group consuming
> > > > >>> > > >     topics from one cluster and committing offsets to another
> > > > >>> > > >     one, right?
> > > > >>> > > >
> > > > >>> > > >     Other than that, it feels like the KIP has too many features, many
> > > > >>> > > >     of which are not really wanted and are counterproductive, but I will just
> > > > >>> > > >     wait and see how the discussion goes.
> > > >>> > > >
> > > >>> > > >     Best Jan
> > > >>> > > >
> > > >>> > > >
> > > >>> > > >     On 15.10.2018 18:16, Ryanne Dolan wrote:
> > > >>> > > >      > Hey y'all!
> > > >>> > > >      >
> > > >>> > > >      > Please take a look at KIP-382:
> > > >>> > > >      >
> > > >>> > > >      >
> > > >>> > > >
> > > >>> > >
> > > >>>
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
> > > >>> > > >      >
> > > >>> > > >      > Thanks for your feedback and support.
> > > >>> > > >      >
> > > >>> > > >      > Ryanne
> > > >>> > > >      >
> > > >>> > > >
> > > >>> > >
> > > >>>
> > > >>
> > >
> >
> >
> > --
> > Best,
> > Alex Mironov
> >
> 
