apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From amol kekre <amolhke...@gmail.com>
Subject Re: Support new major kafka release 0.9.0
Date Wed, 25 Nov 2015 04:02:12 GMT
We do need the Kafka operator to be congnizant of Kafka versions. I am
assuming that is part of the rewrite. I added my comments to the jira.

Thks,
Amol


On Tue, Nov 24, 2015 at 4:43 PM, Pramod Immaneni <pramod@datatorrent.com>
wrote:

> Nice.
>
> On Tue, Nov 24, 2015 at 4:32 PM, Thomas Weise <thomas@datatorrent.com>
> wrote:
>
> > The new API provides for explicit offset storage on the broker side (see
> > commitXXX methods):
> >
> >
> >
> http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
> >
> > With this, we won't need the current offset manager approach.
> >
> > On Tue, Nov 24, 2015 at 1:32 PM, Pramod Immaneni <pramod@datatorrent.com
> >
> > wrote:
> >
> > > What support does the new API provide for offset management?
> > >
> > > On Tue, Nov 24, 2015 at 1:09 PM, Thomas Weise <thomas@datatorrent.com>
> > > wrote:
> > >
> > > > This discussion applies to Kafka 0.8.x only?
> > > >
> > > > With the new consumer API, offset management can be delegated, we
> won't
> > > > need this component any longer. Each partition can record the offset
> > for
> > > > the committed window then.
> > > >
> > > > Thomas
> > > >
> > > > On Tue, Nov 24, 2015 at 12:59 PM, Siyuan Hua <siyuan@datatorrent.com
> >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I need your idea for the design of OffsetManager for kafka input
> > > operator
> > > > >
> > > > > First of all, some background of OffsetManager and why we may need
> > it.
> > > > The
> > > > > OffsetManager is a plugin in kafka input operator for cutomized
> > offset
> > > > > management. The API will be called if the consumer offset(the
> message
> > > > that
> > > > > has been emitted, along with the window id) changes.
> > > > > OffsetManager is different from offset checkpointing, we still use
> > > > > checkpointed offset to recover node from failure.
> > > > >
> > > > > 2 reasons for the need of OffsetManager:
> > > > >   1) User might want to store offsets in their own way (hdfs,
> > > zookeeper,
> > > > > database, etc)
> > > > >   2) User might want to continue consuming at application restart.
> > > > >
> > > > > In the current version, the OffsetManager works in a central mode,
> > each
> > > > > partition only reports the offset(s) to Statslistener, the listener
> > > calls
> > > > > OffsetManager to update the offsets.
> > > > > The other possibility is make the OffsetManager work in a
> distributed
> > > > > mode.  Each partition update the offset(s) on its own.
> > > > >
> > > > > The distributed mode is more straightforward, but the developer
> needs
> > > to
> > > > > know it's distributed, you have to manage write from multiple
> nodes,
> > > > > collisions on your own. But also it's more real time at no risk of
> > > > failure
> > > > > of stats reporter
> > > > >
> > > > > Any input is welcome, thanks!
> > > > >
> > > > > Regards,
> > > > > Siyuan
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Nov 16, 2015 at 10:53 AM, Siyuan Hua <
> siyuan@datatorrent.com
> > >
> > > > > wrote:
> > > > >
> > > > > > I will be working on rewriting the kafka input operator.
> > > > > >
> > > > > > Here is the ticket
> > > > > > https://malhar.atlassian.net/browse/MLHR-1904
> > > > > >
> > > > > > Here is some comments on the ticket
> > > > > >
> > > > > > The RC2 is out here
> > > > > > https://people.apache.org/~junrao/kafka-0.9.0.0-candidate2/
> > > > > >
> > > > > > We will keep most features of the old input operator but the
> > internal
> > > > > > mechanism will be changed, for example, using new API to refresh
> > the
> > > > > > metadata
> > > > > > The bugs that will be fixed:
> > > > > >
> > > > > >    - Synchronized offset checkpoint
> > > > > >    - Transient offsetmanager
> > > > > >
> > > > > > New features:
> > > > > >
> > > > > >    - Support customized partition schema
> > > > > >    - Default OffsetManager using new
> > > > > >
> > > > > > Improvement
> > > > > >
> > > > > >    - Add window id and application name to OffsetManager
> interface
> > > > > >    - Support multi-topic
> > > > > >    - Easy configuration
> > > > > >
> > > > > >
> > > > > > Please leave thoughts here or on the ticket. Thanks
> > > > > >
> > > > > > Best,
> > > > > > Siyuan
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message