kafka-dev mailing list archives

From Ryanne Dolan <ryannedo...@gmail.com>
Subject Re: [VOTE] KIP-382 MirrorMaker 2.0
Date Mon, 07 Jan 2019 23:49:55 GMT
Thanks Jun, I've updated the KIP as requested. Brief notes below:

100. added "...out-of-the-box (without custom handlers)..."

101. done. Good idea to include a MessageFormatter.

102. done.

> 103. [...] why is Heartbeat a separate connector?

Heartbeats themselves are replicated via MirrorSource/SinkConnector, so if
replication stops, you'll stop seeing heartbeats in downstream clusters.
I've updated the KIP to make this clearer and have added a bullet to
Rejected Alternatives.

104. added "heartbeat.retention.ms", "checkpoint.retention.ms", thanks. The
heartbeat topic doesn't need to be compacted.

> 105. [...] I am not sure why targetClusterAlias is useful

In order to map A's B.topic1 to B's topic1, we need to know B.
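
To make that concrete, here is a rough sketch of the renaming in both directions. The class and method names are made up for illustration and are not part of the KIP's API, and the "." separator is assumed from the B.topic1 example above. Since topic names can themselves contain dots, the alias cannot be recovered from the remote topic name alone:

```java
// Hypothetical helper, for illustration only (not the KIP's API).
final class RemoteTopicNames {
    // Assumed separator between cluster alias and topic name.
    private static final String SEPARATOR = ".";

    // Name that a topic from cluster `sourceClusterAlias` gets downstream.
    static String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return sourceClusterAlias + SEPARATOR + topic;
    }

    // Reverse mapping: strip a known alias from a remote topic name.
    // Returns null if the topic was not replicated from that cluster.
    static String upstreamTopic(String clusterAlias, String remoteTopic) {
        String prefix = clusterAlias + SEPARATOR;
        if (!remoteTopic.startsWith(prefix)) {
            return null;
        }
        return remoteTopic.substring(prefix.length());
    }
}
```

With alias "B", upstreamTopic("B", "B.topic1") gives "topic1". A name like "us.west.topic1" maps to "west.topic1" under alias "us" but to "topic1" under alias "us.west", so the caller has to say which cluster it means.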

> 106. [...] should the following properties be prefixed with "consumer."

No, they are part of Connect's worker config.

> 107. So, essentially it's running multiple logical connect clusters on
the same shared worker nodes?

Correct. Rather than configure each Connector and Worker and Herder
individually, a single top-level configuration file is used. And instead of
running a bunch of separate worker processes on each node, a single process
runs multiple workers. This is implemented using a separate driver based on
ConnectDistributed, but which runs multiple DistributedHerders. Each
DistributedHerder uses a different Kafka cluster for coordination -- they
are completely separate apart from running in the same process.
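
As a rough illustration of the "single top-level configuration file" idea, the sketch below splits one flat set of properties into one set of worker properties per cluster. This is not the driver itself: the class name, the "clusters" key, and the "<alias>." key prefix are assumptions made for the example, and the step that would start a DistributedHerder from each result is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: one top-level config in, one worker config per cluster out.
final class PerClusterConfigs {
    static Map<String, Properties> split(Properties top) {
        Map<String, Properties> byCluster = new LinkedHashMap<>();
        // Assumed key: comma-separated list of cluster aliases.
        for (String rawAlias : top.getProperty("clusters", "").split(",")) {
            String alias = rawAlias.trim();
            if (alias.isEmpty()) {
                continue;
            }
            String prefix = alias + ".";
            Properties workerProps = new Properties();
            // Copy every "<alias>.key" entry into that cluster's config as "key".
            for (String key : top.stringPropertyNames()) {
                if (key.startsWith(prefix)) {
                    workerProps.setProperty(key.substring(prefix.length()), top.getProperty(key));
                }
            }
            byCluster.put(alias, workerProps);
        }
        return byCluster;
    }
}
```

Each resulting Properties object would then configure one worker coordinating through its own Kafka cluster, with all of them living in the same process.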

Thanks for helping improve the doc!
Ryanne

On Fri, Jan 4, 2019 at 10:33 AM Jun Rao <jun@confluent.io> wrote:

> Hi, Ryanne,
>
> Thanks for the KIP. I still have a few more comments below.
>
> 100. "This is not possible with MirrorMaker today -- records would be
> replicated back and forth indefinitely, and the topics in either cluster
> would be merged inconsistently between clusters. " This is not 100% true
> since MM can do the topic renaming through MirrorMakerMessageHandler.
>
> 101. For both Heartbeat and checkpoint, could you define the full schema,
> including the field types? Also, how are they serialized into the Kafka
> topic? Is it JSON or something else? For convenience, it would be useful to
> provide a built-in MessageFormatter so that one can read each topic's data
> using tools like ConsoleConsumer.
>
> 102. For the public Heartbeat and Checkpoint class, could you list the
> public methods in each class?
>
> 103. I am wondering why Heartbeat is a separate connector. A MirrorMaker
> connector can die independently of the Heartbeat connector, which seems to
> defeat the purpose of the heartbeat.
>
> 104. Is the Heartbeat topic also a compacted topic? If not, how long is it
> retained for?
>
> 105. For the following, I am not sure why targetClusterAlias is useful? The
> checkpoint file seems to only include sourceClusterAlias.
>
> Map<TopicPartition, Long> translateOffsets(Map<?, ?> targetConsumerConfig,
> String sourceClusterAlias, String targetClusterAlias, String remoteGroupId)
>
> 106. In the configuration example, should the following properties be
> prefixed with "consumer."?
> key.converter = org.apache.kafka.connect.converters.ByteArrayConverter
> value.converter = org.apache.kafka.connect.converters.ByteArrayConverter
>
> 107. Could you add a bit more description of how connect-mirror-maker.sh is
> implemented? My understanding is that it will start as many separate
> DistributedHerders as there are Kafka clusters? So, essentially it's
> running multiple logical connect clusters on the same shared worker nodes?
>
> Thanks,
>
> Jun
>
>
> On Thu, Dec 20, 2018 at 5:23 PM Srinivas Reddy <srinivas96alluri@gmail.com>
> wrote:
>
> > +1 (non binding)
> >
> > Thank you, Ryanne, for the KIP. Let me know if you need support in
> > implementing it.
> >
> > -
> > Srinivas
> >
> > - Typed on tiny keys. pls ignore typos.{mobile app}
> >
> >
> > On Fri, 21 Dec, 2018, 08:26 Ryanne Dolan <ryannedolan@gmail.com> wrote:
> >
> > > Thanks for the votes so far!
> > >
> > > Due to recent discussions, I've removed the high-level REST API from the
> > > KIP.
> > >
> > > > On Thu, Dec 20, 2018 at 12:42 PM Paul Davidson <pdavidson@salesforce.com>
> > > > wrote:
> > >
> > > > +1
> > > >
> > > > > Would be great to see the community build on the basic approach we took
> > > > > with Mirus. Thanks Ryanne.
> > > >
> > > > > On Thu, Dec 20, 2018 at 9:01 AM Andrew Psaltis <psaltis.andrew@gmail.com>
> > > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > > Really looking forward to this and to helping in any way I can. Thanks
> > > > > > for kicking this off Ryanne.
> > > > >
> > > > > > On Thu, Dec 20, 2018 at 10:18 PM Andrew Otto <otto@wikimedia.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > > This looks like a huge project! Wikimedia would be very excited to have
> > > > > > > this. Thanks!
> > > > > >
> > > > > > > On Thu, Dec 20, 2018 at 9:52 AM Ryanne Dolan <ryannedolan@gmail.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > > Hey y'all, please vote to adopt KIP-382 by replying +1 to this thread.
> > > > > > >
> > > > > > > For your reference, here are the highlights of the proposal:
> > > > > > >
> > > > > > > - Leverages the Kafka Connect framework and ecosystem.
> > > > > > > - Includes both source and sink connectors.
> > > > > > > > - Includes a high-level driver that manages connectors in a dedicated
> > > > > > > > cluster.
> > > > > > > > - High-level REST API abstracts over connectors between multiple Kafka
> > > > > > > > clusters.
> > > > > > > > - Detects new topics, partitions.
> > > > > > > > - Automatically syncs topic configuration between clusters.
> > > > > > > > - Manages downstream topic ACL.
> > > > > > > > - Supports "active/active" cluster pairs, as well as any number of active
> > > > > > > > clusters.
> > > > > > > > - Supports cross-data center replication, aggregation, and other complex
> > > > > > > > topologies.
> > > > > > > > - Provides new metrics including end-to-end replication latency across
> > > > > > > > multiple data centers/clusters.
> > > > > > > - Emits offsets required to migrate consumers between clusters.
> > > > > > > - Tooling for offset translation.
> > > > > > > - MirrorMaker-compatible legacy mode.
> > > > > > >
> > > > > > > Thanks, and happy holidays!
> > > > > > > Ryanne
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Paul Davidson
> > > > Principal Engineer, Ajna Team
> > > > Big Data & Monitoring
> > > >
> > >
> >
>
