kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
Subject Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application
Date Mon, 09 Oct 2017 07:54:45 GMT
Matthias,

Thanks for the heads up!

I think the main dependency is from `StreamResseter` to
`ConsumerGroupCommand` class to actually reuse `#reset-offsets`
functionality.

Not sure what would be the better way to remove it. To expose commands
(e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be
re-implemented on the `client` module right? Is this an option? If not I
think we should keep `StreamResseter` as part of `core` module until we
have `ConsumerGroupCommand` on `client` module as well.

El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matthias@confluent.io>)
escribió:

> Jorge,
>
> KIP-198 (that got merged already) overlaps with this KIP. Can you please
> update your KIP accordingly?
>
> Also, while working on KIP-198, we identified some shortcomings in
> AdminClient that do not allow us to move StreamsResetter our of core
> package. We want to address those shortcoming in another KIP to add
> missing functionality to the new AdminClient.
>
> Having say this, and remembering a discussion about dependencies that
> might be introduced by this KIP, it might be good to understand those
> dependencies in detail. Maybe we can resolve those dependencies somehow
> and thus, be able to more StreamsResetter out of core package. Could you
> summarize those dependencies in the KIP or just as a reply?
>
> Thanks!
>
>
> -Matthias
>
> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks Guozhang!
> >
> > I have updated the KIP to:
> >
> > 1. Only one scenario param is allowed. If none, `to-earliest` will be
> used,
> > behaving as the current version.
> >
> > 2.
> >   1. An exception will be printed mentioning that there is no existing
> > offsets registered.
> >   2. inputTopics format could support define partition numbers as in
> > reset-offsets option for kafka-consumer-groups.
> >
> > 3. That should be handled by KIP-198.
> >
> > I will start the VOTE thread in a following email.
> >
> >
> > El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangguoz@gmail.com>)
> > escribió:
> >
> >> Hi Jorge,
> >>
> >> Thanks for the KIP. It would be a great to add feature to the reset
> tools.
> >> I made a pass over it and it looks good to me overall. I have a few
> >> comments:
> >>
> >> 1. For all the scenarios, do we allow users to specify more than one
> >> parameters? If not could you make that clear in the wiki, e.g. we would
> >> return with an error message saying that only one is allowed; if yes
> then
> >> what precedence order we are following?
> >>
> >> 2. Personally I feel that "--by-duration", "--to-offset" and
> "--shift-by"
> >> are a tad overkill, because 1) they assume there exist some committed
> >> offset for each of the topic, but that may not be always true, 2)
> offset /
> >> time shifting amount on different topics may not be a good fit
> universally,
> >> i.e. one could imagine the we want to reset all input topics to their
> >> offsets of a given time, but resetting all topics' offset to the same
> value
> >> or let all of them shifting the same amount of offsets are usually not
> >> applicable. For "--by-duration" it seems could be easily supported by
> the
> >> "to-date".
> >>
> >> For the general consumer group reset tool, since it could be set one per
> >> partition these parameters may be more useful.
> >>
> >> 3. As for the implementation details, when removing zookeeper config in
> >> `kafka-streams-application-reset`, we should consider return a meaning
> >> error message otherwise it would be "unrecognized config" blah.
> >>
> >>
> >> If you feel confident about the wiki after discussing about these
> points,
> >> please feel free to move on to start a voting thread. Note that we are
> >> about 3 weeks away from KIP deadline and 4 weeks away from feature
> >> deadline.
> >>
> >>
> >> Guozhang
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax <matthias@confluent.io
> >
> >> wrote:
> >>
> >>> Thanks for the update Jorge.
> >>>
> >>> I don't have any further comments.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote:
> >>>> I have updated the KIP:
> >>>>
> >>>> - Change execution parameters, using `--dry-run`
> >>>> - Reference KAFKA-4327
> >>>> - And advise about changes on `StreamResetter`
> >>>>
> >>>> Also includes that it will cover a change on `ConsumerGroupCommand`
to
> >>>> align execution options.
> >>>>
> >>>> El dom., 16 jul. 2017 a las 5:37, Matthias J. Sax (<
> >>> matthias@confluent.io>)
> >>>> escribió:
> >>>>
> >>>>> Thanks a lot for the update!
> >>>>>
> >>>>> I like the KIP!
> >>>>>
> >>>>> One more question about `--dry-run` vs `--execute`: While I agree
> that
> >>>>> we should use the same flag for both tools, I am not sure which
one
> is
> >>>>> the better one... My personal take is, that I like `--dry-run`
> better.
> >>>>> Not sure what others think.
> >>>>>
> >>>>> One more comment: with the removal of ZK, we can also tackle this
> >> JIRA:
> >>>>> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I
> think
> >>> we
> >>>>> should mention it in the KIP.
> >>>>>
> >>>>> I am also not sure about backward compatibility issue for this case.
> >>>>> Actually, I don't expect people to call `StreamsResetter` from Java
> >>>>> code, but you can never know. So if we break this, we need to make
> >> sure
> >>>>> to cover it in the KIP and later on in the release notes.
> >>>>>
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> KIP is updated.
> >>>>>> Changes:
> >>>>>> 1. Approach discussed to keep both tools (streams application
> >> resetter
> >>>>> and
> >>>>>> consumer group reset offset).
> >>>>>> 2. Options has been aligned between both tools.
> >>>>>> 3. Zookeeper option from streams-application-resetted will be
> >> removed,
> >>>>> and
> >>>>>> replaced internally for Kafka AdminClient.
> >>>>>>
> >>>>>> Looking forward to your feedback.
> >>>>>>
> >>>>>> El jue., 29 jun. 2017 a las 15:04, Matthias J. Sax (<
> >>>>> matthias@confluent.io>)
> >>>>>> escribió:
> >>>>>>
> >>>>>>> Damian,
> >>>>>>>
> >>>>>>>> resets everything and clears up
> >>>>>>>>> the state stores.
> >>>>>>>
> >>>>>>> That's not correct. The reset tool does not touch local
store. For
> >>> this,
> >>>>>>> we have `KafkaStreams#clenup` -- otherwise, you would need
to run
> >> the
> >>>>>>> tool in every machine you run an application instance.
> >>>>>>>
> >>>>>>> With regard to state, the tool only deletes the underlying
> changelog
> >>>>>>> topics.
> >>>>>>>
> >>>>>>> Just to clarify. The idea of allowing to rest with different
offset
> >> is
> >>>>>>> to clear all state but just use a different start offset
(instead
> of
> >>>>>>> zero). This is for use case where your topic has a larger
retention
> >>> time
> >>>>>>> than the amount of data you want to reprocess. But we always
need
> to
> >>>>>>> start with an empty state. (Resetting with consistent state
is
> >>> something
> >>>>>>> we might do at some point, but it's much hard and not part
of this
> >>> KIP)
> >>>>>>>
> >>>>>>>> @matthias, could we remove the ZK dependency from the
streams
> reset
> >>>>> tool
> >>>>>>>> now?
> >>>>>>>
> >>>>>>> I think so. The new AdminClient provide the feature we need
AFAIK.
> I
> >>>>>>> guess we can picky back this into the KIP (we would need
a KIP
> >> anyway
> >>>>>>> because we deprecate `--zookeeper` what is an public API
change).
> >>>>>>>
> >>>>>>>
> >>>>>>> I just want to point out, that even without ZK dependency,
I prefer
> >> to
> >>>>>>> wrap the consumer offset tool and keep two tools.
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>>
> >>>>>>> On 6/29/17 9:14 AM, Damian Guy wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Thanks for the KIP. What is not clear is how is this
going to
> >> handle
> >>>>>>> state
> >>>>>>>> stores? Right now the streams reset tool, resets everything
and
> >>> clears
> >>>>> up
> >>>>>>>> the state stores. What are we going to do if we reset
to a
> >> particular
> >>>>>>>> offset? If we clear the state then we've lost any previously
> >>> aggregated
> >>>>>>>> values (which may or may not be what is expected). If
we don't
> >> clear
> >>>>>>> them,
> >>>>>>>> then we will end up with incorrect aggregates.
> >>>>>>>>
> >>>>>>>> @matthias, could we remove the ZK dependency from the
streams
> reset
> >>>>> tool
> >>>>>>>> now?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Damian
> >>>>>>>>
> >>>>>>>> On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate
Otoya <
> >>>>>>>> quilcate.jorge@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> You're right, I was not considering Zookeeper dependency.
> >>>>>>>>>
> >>>>>>>>> I'm starting to like the idea to wrap `reset-offset`
from
> >>>>>>>>> `streams-application-reset`.
> >>>>>>>>>
> >>>>>>>>> I will wait a bit for more feedback and work on
a draft with this
> >>>>>>> changes.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> El mié., 28 jun. 2017 a las 0:20, Matthias J. Sax
(<
> >>>>>>> matthias@confluent.io
> >>>>>>>>>> )
> >>>>>>>>> escribió:
> >>>>>>>>>
> >>>>>>>>>> I agree, that we should not duplicate functionality.
> >>>>>>>>>>
> >>>>>>>>>> However, I am worried, that a non-streams users
using the offset
> >>>>> reset
> >>>>>>>>>> tool might delete topics unintentionally (even
if the changes
> are
> >>>>>>> pretty
> >>>>>>>>>> small). Also, currently the stream reset tool
required Zookeeper
> >>>>> while
> >>>>>>>>>> the offset reset tool does not -- I don't think
we should add
> >> this
> >>>>>>>>>> dependency to the offset reset tool.
> >>>>>>>>>>
> >>>>>>>>>> Thus, it think it might be better to keep both
tools, but
> >>> internally
> >>>>>>>>>> rewrite the streams reset entry class, to reuse
as much as
> >> possible
> >>>>>>> from
> >>>>>>>>>> the offset reset tool. Ie. the streams tool
would be a wrapper
> >>> around
> >>>>>>>>>> the offset tool and add some functionality it
needs that is
> >> Streams
> >>>>>>>>>> specific.
> >>>>>>>>>>
> >>>>>>>>>> I also think, that keeping separate tools for
consumers and
> >> streams
> >>>>> is
> >>>>>>> a
> >>>>>>>>>> good thing. We might want to add new features
that don't apply
> to
> >>>>> plain
> >>>>>>>>>> consumers -- note, a Streams applications is
more than just a
> >>> client.
> >>>>>>>>>>
> >>>>>>>>>> WDYT?
> >>>>>>>>>>
> >>>>>>>>>> Would be good to get some feedback from others,
too.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> -Matthias
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya
wrote:
> >>>>>>>>>>> Thanks for the feedback Matthias!
> >>>>>>>>>>>
> >>>>>>>>>>> The main idea is to use only 1 tool to reset
offsets and don't
> >>>>>>>>> replicate
> >>>>>>>>>>> functionality between tools.
> >>>>>>>>>>> Reset Offset (KIP-122) tool not only reset
but support execute
> >> the
> >>>>>>>>> reset
> >>>>>>>>>>> but also export, import from files, etc.
> >>>>>>>>>>> If we extend the current tool (kafka-streams-application-
> >>> reset.sh)
> >>>>> we
> >>>>>>>>>> will
> >>>>>>>>>>> have to duplicate all this functionality
also.
> >>>>>>>>>>> Maybe another option is to move the current
implementation into
> >>>>>>>>>>> `kafka-consumer-group` and add a new command
> >>>>> `--reset-offset-streams`
> >>>>>>>>>> with
> >>>>>>>>>>> the current implementation functionality
and add
> >> `--reset-offset`
> >>>>>>>>> options
> >>>>>>>>>>> for input topics. Does this make sense?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> El lun., 26 jun. 2017 a las 23:32, Matthias
J. Sax (<
> >>>>>>>>>> matthias@confluent.io>)
> >>>>>>>>>>> escribió:
> >>>>>>>>>>>
> >>>>>>>>>>>> Jorge,
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks a lot for this KIP. Allowing
the reset streams
> >>> applications
> >>>>>>>>> with
> >>>>>>>>>>>> arbitrary start offset is something
we got multiple requests
> >>>>> already.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Couple of clarification question:
> >>>>>>>>>>>>
> >>>>>>>>>>>>  - why do you want to deprecate the
current tool instead of
> >>>>> extending
> >>>>>>>>>>>> the current tool with the stuff the
offset reset tool can do
> >> (ie,
> >>>>> use
> >>>>>>>>>>>> the offset reset tool internally)
> >>>>>>>>>>>>
> >>>>>>>>>>>>  - you suggest to extend the offset
reset tool to replace the
> >>>>> stream
> >>>>>>>>>>>> reset tool: how would the reset tool
know if it is resetting a
> >>>>>>> streams
> >>>>>>>>>>>> applications or a regular consumer group?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 6/26/17 1:28 PM, Jorge Esteban Quilcate
Otoya wrote:
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'd like to start the discussion
to add reset offset tooling
> >> for
> >>>>>>>>> Stream
> >>>>>>>>>>>>> applications.
> >>>>>>>>>>>>> The KIP can be found here:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Jorge.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message