kafka-dev mailing list archives

From Becket Qin <becket....@gmail.com>
Subject Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
Date Mon, 07 Nov 2016 18:48:59 GMT
Hi Michael,

Yes, changing the logic in the log cleaner makes sense. There could be some
other things worth thinking about, though (e.g. the change in message size
after conversion).

The scenario I was thinking of is the following:
Imagine a distributed caching system built on top of Kafka. A user is
consuming from a topic, and it is guaranteed that if the user consumes to
the log end it will get the latest value for all the keys. Currently, if the
consumer sees a null value it knows the key has been removed. Now let's say
we roll out this change, and a producer sends a message with the tombstone
flag set but a non-null value. When we append that message to the log, I
suppose we will not do the down conversion if the broker has set
message.format.version to the latest. Because the log cleaner won't touch
the active log segment, that message will sit in the active segment as is.
Now when a consumer that hasn't been upgraded yet consumes that tombstone
message from the active segment, it seems the broker will need to down
convert that message to remove the value, right? In this case, we cannot
wait for the log cleaner to do the down conversion, because the message may
already have been consumed before log compaction happens.
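
To make the scenario concrete, here is a minimal Python sketch of the kind of
fetch-path down conversion it seems to imply. Everything in it is an
assumption for illustration: Record, TOMBSTONE_MIN_FETCH_VERSION and
down_convert_for_fetch are hypothetical names, not broker code, and the V4
cutoff is simply the FetchRequest version mentioned earlier in the thread.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumption: fetch requests at or above this version understand the flag.
# V4 is the version discussed in the thread, not confirmed broker behavior.
TOMBSTONE_MIN_FETCH_VERSION = 4


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified stand-in for a stored message."""
    key: bytes
    value: Optional[bytes]
    tombstone: bool = False


def down_convert_for_fetch(record: Record, fetch_version: int) -> Record:
    """Return the record as a consumer using fetch_version should see it."""
    if fetch_version >= TOMBSTONE_MIN_FETCH_VERSION:
        return record  # new consumers read the flag directly
    # Old consumers only recognize a null value as a delete, so drop the
    # value of flagged records and clear the bit they do not know about.
    value = None if record.tombstone else record.value
    return replace(record, value=value, tombstone=False)


# A flagged tombstone that still carries a value, sitting in the active segment.
stored = Record(key=b"user-42", value=b"deleted-by=host-a", tombstone=True)
old_view = down_convert_for_fetch(stored, fetch_version=3)
new_view = down_convert_for_fetch(stored, fetch_version=4)
```

The point of the sketch is where this would have to run: on the fetch path,
for records still in the active segment, since the cleaner has not seen them
yet.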

Thanks,

Jiangjie (Becket) Qin



On Mon, Nov 7, 2016 at 9:59 AM, Michael Pearce <Michael.Pearce@ig.com>
wrote:

> Hi Becket,
>
> We were thinking more about making the logic that’s in the method
> shouldRetainMessage configurable via
> http://kafka.apache.org/documentation.html#brokerconfigs at a broker/topic
> level, and then scrapping auto-converting the message, so that
> organisations can manage the rollout of enabling the feature themselves.
> (This isn’t in the documentation; it is a response to the discussion
> thread, as an alternative approach to rolling out the feature.)
>
> Does this make any more sense?
>
> Thanks
> Mike
>
> On 11/3/16, 2:27 PM, "Becket Qin" <becket.qin@gmail.com> wrote:
>
>     Hi Michael,
>
>     Do you mean using a new configuration, or is it just the existing
>     message.format.version config? It seems the message.format.version
>     config is enough in this case. And the default value would always be
>     the latest version.
>
>     > Message version migration would be handled as like in KIP-32
>
>     Also just want to confirm on this. Today if an old consumer consumes a
>     log compacted topic and sees an empty value, it knows that is a
>     tombstone. After we start to use the attribute bit, a tombstone message
>     can have a non-empty value. So by "like in KIP-32" you mean we will
>     remove the value to down convert the message if the consumer version is
>     old, right?
>
>     Thanks.
>
>     Jiangjie (Becket) Qin
>
>     On Wed, Nov 2, 2016 at 1:37 AM, Michael Pearce <Michael.Pearce@ig.com>
>     wrote:
>
>     > Hi Joel, et al.
>     >
>     > Any comments on the below idea to handle roll out / compatibility of
>     > this feature, using a configuration?
>     >
>     > Does it make sense/clear?
>     > Does it add value?
>     > Do we want to enforce flag by default, or value by default, or both?
>     >
>     > Cheers
>     > Mike
>     >
>     >
>     > On 10/27/16, 4:47 PM, "Michael Pearce" <Michael.Pearce@ig.com> wrote:
>     >
>     >     Thanks, James, I think this is a really good addition to the KIP
>     >     details. Please feel free to amend the wiki and add the use cases,
>     >     and any others you think of. I definitely think it’s worthwhile
>     >     documenting. If you can’t, let me know and I’ll add them next week
>     >     (just leaving for a long weekend off).
>     >
>     >     Re Joel's and others' comments about upgrade and compatibility:
>     >
>     >     Rather than trying to auto-manage this, maybe we make a
>     >     configuration option, at both server and per-topic level, to
>     >     control how the server logic works out whether a record is a
>     >     tombstone record.
>     >
>     >     e.g.
>     >
>     >     key = compaction.tombstone.marker
>     >
>     >     value options:
>     >
>     >     value          (continues to use null value as tombstone marker)
>     >     flag           (expects to use the tombstone flag)
>     >     value_or_flag  (if either is true it treats the record as a
>     >                    tombstone)
>     >
>     >     This way, on upgrade users can keep the current behavior and
>     >     slowly migrate to the new one, with a transition period of using
>     >     value_or_flag, and finally flag only if an organization wishes to
>     >     use null values without them being treated as a tombstone marker
>     >     (use case noted below).
>     >
>     >     Having it at both global broker level and as a topic override
>     >     also allows some flexibility here.
>     >
>     >     Cheers
>     >     Mike
>     >
>     >
>     >
>     >
>     >
>     >
>     >     On 10/27/16, 8:03 AM, "James Cheng" <wushujames@gmail.com> wrote:
>     >
>     >         This KIP would definitely address a gap in the current
>     >         functionality, where you currently can't have a tombstone
>     >         with any associated content.
>     >
>     >         That said, I'd like to talk about use cases, to make sure
>     >         that this is in fact useful. The KIP should be updated with
>     >         whatever use cases we come up with.
>     >
>     >         First of all, an observation: When we speak about log
>     >         compaction, we typically think of "the latest message for a
>     >         key is retained". In that respect, a delete tombstone (i.e. a
>     >         message with a null payload) is treated the same as any other
>     >         Kafka message: the latest message is retained. It doesn't
>     >         matter whether the latest message is null, or if the latest
>     >         message has actual content. In all cases, the last message is
>     >         retained.
>     >
>     >         The only way a delete tombstone is treated differently from
>     >         other Kafka messages is that it automatically disappears
>     >         after a while. The time of deletion is specified using
>     >         delete.retention.ms.
>     >
>     >         So what we're really talking about is, do we want to support
>     >         messages in a log-compacted topic that auto-delete themselves
>     >         after a while?
>     >
>     >         In a thread from 2015, there was a discussion on first-class
>     >         support of headers between Roger Hoover, Felix GV, Jun Rao,
>     >         and me. See the thread at
>     >         https://groups.google.com/d/msg/confluent-platform/8xPbjyUE_7E/yQ1AeCufL_gJ
>     >         In that thread, Jun raised a good question that I didn't have
>     >         a good answer for at the time: If a message is going to
>     >         auto-delete itself after a while, how important was the
>     >         message? That is, what information did the message contain
>     >         that was important *for a while* but not so important that it
>     >         needed to be kept around forever?
>     >
>     >         Some use cases that I can think of:
>     >
>     >         1) Traceability. I would like to know who issued this delete
>     >         tombstone. It might include the hostname, IP of the producer
>     >         of the delete.
>     >         2) Timestamps. I would like to know when this delete was
>     >         issued. This use case is already addressed by the
>     >         availability of per-message timestamps that came in 0.10.0.
>     >         3) Data provenance. I hope I'm using this phrase correctly,
>     >         but what I mean is, where did this delete come from? What
>     >         processing job emitted it? What input to the processing job
>     >         caused this delete to be produced? For example, if a record
>     >         in topic A was processed and caused a delete tombstone to be
>     >         emitted to topic B, I might like the offset of the topic A
>     >         message to be attached to the topic B message.
>     >         4) Distributed tracing for stream topologies. This might be a
>     >         slight repeat of the above use cases. In the microservices
>     >         world, we can generate call-graphs of webservices using tools
>     >         like Zipkin/opentracing.io <http://opentracing.io/>, or
>     >         something homegrown like
>     >         https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency
>     >         I can imagine that you might want to do something similar for
>     >         stream processing topologies, where stream processing jobs
>     >         carry along and forward along a globally unique identifier,
>     >         and a distributed topology graph is generated.
>     >         5) Cases where processing a delete requires data that is not
>     >         available in the message key. I'm not sure I have a good
>     >         example of this, though. One hand-wavy example might be where
>     >         I am publishing documents into Kafka where the documentId is
>     >         the message key, and the text contents of the document are in
>     >         the message body. And I have a consuming job that does some
>     >         analytics on the message body. If that document gets deleted,
>     >         then the consuming job might need the original message body
>     >         in order to "delete" that message's impact from the
>     >         analytics. But I'm not sure that is a great example. If the
>     >         consumer was worried about that, the consumer would probably
>     >         keep the original message around, stored by primary key. And
>     >         then all it would need from a delete message would be the
>     >         primary key of the message.
>     >
>     >         Do people think these are valid use cases?
>     >
>     >         What are other use cases that people can think of?
>     >
>     >         -James
>     >
>     >         > On Oct 26, 2016, at 3:46 PM, Mayuresh Gharat
>     >         > <gharatmayuresh15@gmail.com> wrote:
>     >         >
>     >         > +1 @Joel.
>     >         > I think a clear migration plan of upgrading and downgrading
>     >         > of server and clients, along with handling of the issues
>     >         > that Joel mentioned, on the KIP would be really great.
>     >         >
>     >         > Thanks,
>     >         >
>     >         > Mayuresh
>     >         >
>     >         > On Wed, Oct 26, 2016 at 3:31 PM, Joel Koshy
>     >         > <jjkoshy.w@gmail.com> wrote:
>     >         >
>     >         >> I'm not sure why it would be useful, but it should be
>     >         >> theoretically possible if the attribute bit alone is
>     >         >> enough to mark a tombstone. OTOH, we could consider that
>     >         >> as invalid if we wish. These are relevant details that I
>     >         >> think should be added to the KIP.
>     >         >>
>     >         >> Also, in the few odd scenarios that I mentioned we should
>     >         >> also consider that fetches could be coming from other
>     >         >> yet-to-be-upgraded brokers in a cluster that is being
>     >         >> upgraded. So we would probably want to continue to support
>     >         >> nulls as tombstones or down-convert in a way that we are
>     >         >> sure works with least surprise to fetchers.
>     >         >>
>     >         >> There is a slightly vague statement under "Compatibility,
>     >         >> Deprecation, and Migration Plan" that could benefit from
>     >         >> more details: *Logic would base on current behavior of
>     >         >> null value or if tombstone flag set to true, as such
>     >         >> wouldn't impact any existing flows simply allow new
>     >         >> producers to make use of the feature*. It is unclear to me
>     >         >> based on that whether you would interpret null as a
>     >         >> tombstone if the tombstone attribute bit is off.
>     >         >>
>     >         >> On Wed, Oct 26, 2016 at 3:10 PM, Xavier Léauté
>     >         >> <xavier@confluent.io> wrote:
>     >         >>
>     >         >>> Does this mean that starting with V4 requests we would
>     >         >>> allow storing null messages in compacted topics? The KIP
>     >         >>> should probably clarify the behavior for null messages
>     >         >>> where the tombstone flag is not set.
>     >         >>>
>     >         >>> On Wed, Oct 26, 2016 at 1:32 AM Magnus Edenhill
>     >         >>> <magnus@edenhill.se> wrote:
>     >         >>>
>     >         >>>> 2016-10-25 21:36 GMT+02:00 Nacho Solis
>     >         >>>> <nsolis@linkedin.com.invalid>:
>     >         >>>>
>     >         >>>>> I think you probably require a MagicByte bump if you
>     >         >>>>> expect correct behavior of the system as a whole.
>     >         >>>>>
>     >         >>>>> From a client perspective you want to make sure that
>     >         >>>>> when you deliver a message that the broker supports the
>     >         >>>>> feature you're expecting (compaction). So, depending on
>     >         >>>>> the behavior of the broker on encountering a previously
>     >         >>>>> undefined bit flag I would suggest making some change
>     >         >>>>> to make certain that flag-based compaction is
>     >         >>>>> supported. I'm going to guess that the MagicByte would
>     >         >>>>> do this.
>     >         >>>>>
>     >         >>>>
>     >         >>>> I don't believe this is needed since it is already
>     >         >>>> attributed through the request's API version.
>     >         >>>>
>     >         >>>> Producer:
>     >         >>>> * if a client sends ProduceRequest V4 then
>     >         >>>>   attributes.bit5 indicates a tombstone
>     >         >>>> * if a client sends ProduceRequest <V4 then
>     >         >>>>   attributes.bit5 is ignored and value==null indicates a
>     >         >>>>   tombstone
>     >         >>>> * in both cases the on-disk messages are stored with
>     >         >>>>   attributes.bit5 (I assume?)
>     >         >>>>
>     >         >>>> Consumer:
>     >         >>>> * if a client sends FetchRequest V4 messages are
>     >         >>>>   sendfile():ed directly from disk (with attributes.bit5)
>     >         >>>> * if a client sends FetchRequest <V4 messages are
>     >         >>>>   slowpathed and translated from attributes.bit5 to
>     >         >>>>   value=null as required.
>     >         >>>>
>     >         >>>>
>     >         >>>> That's my understanding anyway, please correct me if I'm
>     >         >>>> wrong.
>     >         >>>>
>     >         >>>> /Magnus
>     >         >>>>
>     >         >>>>
>     >         >>>>
>     >         >>>>> On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill
>     >         >>>>> <magnus@edenhill.se> wrote:
>     >         >>>>>
>     >         >>>>>> It is safe to assume that a previously undefined
>     >         >>>>>> attributes bit will be unset in protocol requests from
>     >         >>>>>> existing clients; if not, such a client is already
>     >         >>>>>> violating the protocol and needs to be fixed.
>     >         >>>>>>
>     >         >>>>>> So I don't see a need for a MagicByte bump; both broker
>     >         >>>>>> and client have the information they need to construct
>     >         >>>>>> or parse the message according to request version.
>     >         >>>>>>
>     >         >>>>>>
>     >         >>>>>> 2016-10-25 18:48 GMT+02:00 Michael Pearce
>     >         >>>>>> <Michael.Pearce@ig.com>:
>     >         >>>>>>
>     >         >>>>>>> Hi Magnus,
>     >         >>>>>>>
>     >         >>>>>>> I was wondering if I even needed to change those
>     >         >>>>>>> also, as technically we’re just making use of an
>     >         >>>>>>> unused attribute bit, but I’m not 100% sure that it
>     >         >>>>>>> is always false currently.
>     >         >>>>>>>
>     >         >>>>>>> If someone can say 100% it will already be set false
>     >         >>>>>>> with current and historic bitwise masking techniques
>     >         >>>>>>> used over time, we could do away with both, and
>     >         >>>>>>> simply just start to use it. Unfortunately I don’t
>     >         >>>>>>> have that historic knowledge so was hoping it would
>     >         >>>>>>> be flagged up in this discussion thread ☺
>     >         >>>>>>>
>     >         >>>>>>> Cheers
>     >         >>>>>>> Mike
>     >         >>>>>>>
>     >         >>>>>>> On 10/25/16, 5:36 PM, "Magnus Edenhill"
>     >         >>>>>>> <magnus@edenhill.se> wrote:
>     >         >>>>>>>
>     >         >>>>>>>    Hi Michael,
>     >         >>>>>>>
>     >         >>>>>>>    With the version bumps for Produce and Fetch
>     >         >>>>>>>    requests, do you really need to bump MagicByte too?
>     >         >>>>>>>
>     >         >>>>>>>    Regards,
>     >         >>>>>>>    Magnus
>     >         >>>>>>>
>     >         >>>>>>>
>     >         >>>>>>>    2016-10-25 18:09 GMT+02:00 Michael Pearce
>     >         >>>>>>>    <Michael.Pearce@ig.com>:
>     >         >>>>>>>
>     >         >>>>>>>> Hi All,
>     >         >>>>>>>>
>     >         >>>>>>>> I would like to discuss the following KIP proposal:
>     >         >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>     >         >>>>>>>>
>     >         >>>>>>>> This is off the back of the discussion on KIP-82 /
>     >         >>>>>>>> KIP meeting where it was agreed to separate this
>     >         >>>>>>>> issue and feature. See:
>     >         >>>>>>>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g@mail.gmail.com%3e
>     >         >>>>>>>>
>     >         >>>>>>>> Thanks
>     >         >>>>>>>> Mike
>     >         >>>>>>>>
>     >         >>>>>>>> The information contained in this email is strictly
>     >         >>>>>>>> confidential and for the use of the addressee only,
>     >         >>>>>>>> unless otherwise indicated. If you are not the
>     >         >>>>>>>> intended recipient, please do not read, copy, use or
>     >         >>>>>>>> disclose to others this message or any attachment.
>     >         >>>>>>>> Please also notify the sender by replying to this
>     >         >>>>>>>> email or by telephone (+44(020 7896 0011) and then
>     >         >>>>>>>> delete the email and any copies of it. Opinions,
>     >         >>>>>>>> conclusion (etc) that do not relate to the official
>     >         >>>>>>>> business of this company shall be understood as
>     >         >>>>>>>> neither given nor endorsed by it. IG is a trading
>     >         >>>>>>>> name of IG Markets Limited (a company registered in
>     >         >>>>>>>> England and Wales, company number 04008957) and IG
>     >         >>>>>>>> Index Limited (a company registered in England and
>     >         >>>>>>>> Wales, company number 01190902). Registered address
>     >         >>>>>>>> at Cannon Bridge House, 25 Dowgate Hill, London EC4R
>     >         >>>>>>>> 2YA. Both IG Markets Limited (register number
>     >         >>>>>>>> 195355) and IG Index Limited (register number
>     >         >>>>>>>> 114059) are authorised and regulated by the
>     >         >>>>>>>> Financial Conduct Authority.
>     >         >>>>>>>>
>     >         >>>>>>>
>     >         >>>>>>>
>     >         >>>>>>>
>     >         >>>>>>
>     >         >>>>>
>     >         >>>>>
>     >         >>>>>
>     >         >>>>> --
>     >         >>>>> Nacho (Ignacio) Solis
>     >         >>>>> Kafka
>     >         >>>>> nsolis@linkedin.com
>     >         >>>>>
>     >         >>>>
>     >         >>>
>     >         >>
>     >         >
>     >         >
>     >         >
>     >         > --
>     >         > -Regards,
>     >         > Mayuresh R. Gharat
>     >         > (862) 250-7125
>     >
>     >
>     >
>     >
>     >
>     >
>
>
>
