kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Becket Qin <becket....@gmail.com>
Subject Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.
Date Wed, 06 Jan 2016 16:09:38 GMT
Hey Jay,

I agree with your comments on the KIP page. 
As you said it is important to document what are the changes and why we made those decisions.
I have added summaries to the page to help readers understand the decision we made more easily
and also try to avoid being too verbose. I also clarified the scope of KIP-32 and KIP-33.
Please let me know if you think we should make further improvements.

Thanks,

Jiangjie (Becket) Qin


> On Jan 6, 2016, at 7:21 AM, Jay Kreps <jay@confluent.io> wrote:
> 
> I'm +1 on the design, but this proposal doesn't look like it is quite
> finished. We are trying to make the KIPs really be a record of design
> decisions we made and why. We should be able to give the document to
> advanced Kafka users and have them walk away with a good idea of what
> we are changing and why. I think having this kind of record of why
> changes were made is really valuable especially given the amount of
> times these things come back up (I'm quite sure this saves time in the
> long run).
> 
> If people think this is a waste of time, let's revisit, but if not
> could we try to bring this up to the level of quality of the other
> KIPs?
> 
> In particular, a few things:
> 
> Overall this reads more like implementation notes I might make for
> myself rather than a design doc.
> 
> This doesn't really motivate the idea of creation vs append time or
> when each would be used or why you would want both.
> 
> The public interfaces section says "There are a few options to achieve
> the goals. Each has their own pros and cons. Please see the details
> below." This might have been true in the first version as we were
> discussing but now we have one proposal, right? We should list out the
> new public interfaces (binary format, java api, configs). This is a
> way of calling out that "the user gets this new stuff".
> 
> This still has a number of details that I think we've now moved to
> KIP-33. i.e. the bits about indexing are now part of that?
> 
> Cheers,
> 
> -Jay
> 
>> On Tue, Jan 5, 2016 at 11:47 AM, Neha Narkhede <neha@confluent.io> wrote:
>> +1 on the KIP. Thanks for the hard work, Becket!
>> 
>>> On Tue, Jan 5, 2016 at 11:24 AM, Jun Rao <jun@confluent.io> wrote:
>>> 
>>> Jiangjie,
>>> 
>>> Thanks for the updated KIP. Overall, a +1 on the proposal. A few minor
>>> comments on the KIP.
>>> 
>>> KIP-32:
>>> 50. 6.c says "The log rolling has to depend on the earliest timestamp",
>>> which is inconsistent with KIP-33.
>>> 
>>> 51. 8.b "If the time difference threshold is set to 0. The timestamp in the
>>> message is equivalent to LogAppendTime." If the time difference is 0 and
>>> CreateTime is used, all messages will likely be rejected in this proposal.
>>> So, it's not equivalent to LogAppendTime.
>>> 
>>> 52. Could you include the new value of magic byte in message format change?
>>> Also, do we have a single new message format that includes both the offset
>>> change (relative offset for inner messages) and the addition of timestamp?
>>> 
>>> 53. Could you document the changes in ProducerRequest V2 and FetchRequest
>>> V2 (and the responses)?
>>> 
>>> 54. In migration phase 1, step 2, does internal ApiVersion mean
>>> inter.broker.protocol.version?
>>> 
>>> 55. In canary step 2.b, it says "It will only see
>>> ProduceRequest/FetchRequest V1 from other brokers and clietns.". But in
>>> phase 2, a broker will receive FetchRequest V2 from other brokers.
>>> 
>>> 
>>> KIP-33:
>>> 60. The KIP still says maintaining index at "at minute granularity" even
>>> though the index interval is configurable now.
>>> 
>>> 61. In this design, it's possible for a log segment to have an empty time
>>> index. In the worse case, we may have to scan more than the active segment
>>> to recover the latest timestamp.
>>> 
>>> Thanks,
>>> 
>>> Jun
>>> 
>>> On Mon, Jan 4, 2016 at 11:37 AM, Aditya Auradkar <
>>> aauradkar@linkedin.com.invalid> wrote:
>>> 
>>>> Hey Becket/Anna -
>>>> 
>>>> I have a few comments about the KIP.
>>>> 
>>>> 1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and
>>>> LogAppendTime etc..". This is actually the title of the now rejected
>>> Option
>>>> 1.
>>>> 2. (Minor) Can we rename the proposed option? It isn't really "option 4"
>>>> anymore.
>>>> 3. I'm not clear on what exactly happens to compressed messages
>>>> when message.timestamp.type=LogAppendTime? Does every batch get
>>>> recompressed because the inner message gets rewritten with the server
>>>> timestamp? Or does the message set on disk have the timestamp set to -1.
>>> In
>>>> that case, what do we use as timestamp for the message?
>>>> 4. Do message.timestamp.type and max.message.time.difference.ms need to
>>> be
>>>> per-topic configs? It seems that this is really a client config i.e. a
>>>> client is the source of timestamps not a topic. It could also be a
>>>> broker-level config to keep things simple.
>>>> 5. The "Proposed Changes" section in the KIP tries to build a time-based
>>>> index for query but that is a separate proposal (KIP-33). Can we more
>>>> crisply identify what exactly will change when this KIP (and 31) is
>>>> implemented? It isn't super clear to me at this point.
>>>> 
>>>> Aside from that, I think the "Rejected Alternatives" section of the KIP
>>> is
>>>> excellent. Very good insight into what options were discussed and
>>> rejected.
>>>> 
>>>> Aditya
>>>> 
>>>>> On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin <becket.qin@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks Guozhang, Gwen and Neha for the comments. Sorry for late reply
>>>>> because I only have occasional gmail access from my phone...
>>>>> 
>>>>> I just updated the wiki for KIP-32.
>>>>> 
>>>>> Gwen,
>>>>> 
>>>>> Yes, the migration plan is what you described.
>>>>> 
>>>>> I agree with your comments on the version.
>>>>> I changed message.format.version to use the release version.
>>>>> I did not change the internal version, we can discuss this in a
>>> separate
>>>>> thread.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Jiangjie (Becket) Qin
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Dec 24, 2015, at 5:38 AM, Guozhang Wang <wangguoz@gmail.com>
>>> wrote:
>>>>>> 
>>>>>> Also I agree with Gwen that such changes may worth a 0.10 release
or
>>>> even
>>>>>> 1.0, having it in 0.9.1 would be quite confusing to users.
>>>>>> 
>>>>>> Guozhang
>>>>>> 
>>>>>>> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang <wangguoz@gmail.com>
>>>>> wrote:
>>>>>>> 
>>>>>>> Becket,
>>>>>>> 
>>>>>>> Please let us know once you have updated the wiki page regarding
the
>>>>>>> migration plan. Thanks!
>>>>>>> 
>>>>>>> Guozhang
>>>>>>> 
>>>>>>>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira <gwen@confluent.io>
>>>>> wrote:
>>>>>>>> 
>>>>>>>> Thanks Becket, Anne and Neha for responding to my concern.
>>>>>>>> 
>>>>>>>> I had an offline discussion with Anne where she helped me
>>> understand
>>>>> the
>>>>>>>> migration process. It isn't as bad as it looks in the KIP
:)
>>>>>>>> 
>>>>>>>> If I understand it correctly, the process (for users) will
be:
>>>>>>>> 
>>>>>>>> 1. Prepare for upgrade (set format.version = 0, ApiVersion
= 0.9.0)
>>>>>>>> 2. Rolling upgrade of brokers
>>>>>>>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between
brokers
>>> will
>>>>> use
>>>>>>>> the new protocol
>>>>>>>> 4. Start upgrading clients
>>>>>>>> 5. When "enough" clients are upgraded, bump format.version
to 1
>>>>> (rolling).
>>>>>>>> 
>>>>>>>> Becket, can you confirm?
>>>>>>>> 
>>>>>>>> Assuming this is the process, I'm +1 on the change.
>>>>>>>> 
>>>>>>>> Reminder to coders and reviewers that pull-requests with
>>> user-facing
>>>>>>>> changes should include documentation changes as well as code
>>> changes.
>>>>>>>> And a polite request to try to be helpful to users on when
to use
>>>>>>>> create-time and when to use log-append-time as configuration
- this
>>>> is
>>>>> not
>>>>>>>> a trivial decision.
>>>>>>>> 
>>>>>>>> A separate point I'm going to raise in a different thread
is that
>>> we
>>>>> need
>>>>>>>> to streamline our versions a bit:
>>>>>>>> 1. I'm afraid that 0.9.0-1 will be confusing to users who
care
>>> about
>>>>>>>> released versions (what if we forget to change it before
the
>>> release?
>>>>> Is
>>>>>>>> it
>>>>>>>> meaningful enough to someone running off trunk?), we need
to come
>>> up
>>>>> with
>>>>>>>> something that will work for both LinkedIn and everyone else.
>>>>>>>> 2. ApiVersion has real version numbers. message.format.version
has
>>>>>>>> sequence
>>>>>>>> numbers. This makes us look pretty silly :)
>>>>>>>> 
>>>>>>>> My version concerns can be addressed separately and should
not hold
>>>>> back
>>>>>>>> this KIP.
>>>>>>>> 
>>>>>>>> Gwen
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin <becket.qin@gmail.com
>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Anna,
>>>>>>>>> 
>>>>>>>>> Thanks for initiating the voting process. I did not start
the
>>> voting
>>>>>>>>> process because there were still some ongoing discussion
with Jun
>>>>> about
>>>>>>>> the
>>>>>>>>> timestamp regarding compressed messages. That is why
the wiki page
>>>>>>>> hasn't
>>>>>>>>> reflected the latest conversation as Guozhang pointed
out.
>>>>>>>>> 
>>>>>>>>> Like Neha said I think we have reached general agreement
on this
>>>> KIP.
>>>>> So
>>>>>>>>> it is probably fine to start the KIP voting. At least
we draw more
>>>>>>>>> attention to the KIP even if there are some new discussion
to
>>> bring
>>>>> up.
>>>>>>>>> 
>>>>>>>>> Regarding the upgrade plan, given we decided to implement
KIP-31
>>> and
>>>>>>>>> KIP-32 in the same patch to avoid change binary protocol
twice,
>>> the
>>>>>>>> upgrade
>>>>>>>>> plan was mostly discussed on the discussion thread of
KIP-31.
>>>>>>>>> 
>>>>>>>>> Since the voting has been initiated, I will update the
wiki with
>>>>> latest
>>>>>>>>> conversation to avoid further confusion.
>>>>>>>>> 
>>>>>>>>> BTW, I actually have started coding work on KIP-31 and
KIP-32 and
>>>> will
>>>>>>>>> focus on the patch before I return from vacation in mid
Jan
>>> because
>>>> I
>>>>>>>> have
>>>>>>>>> no LInkedIn VPN access in China anyway...
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>> Jiangjie
>>>>>>>>> 
>>>>>>>>>> On Dec 23, 2015, at 12:31 PM, Anna Povzner <anna@confluent.io>
>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Gwen,
>>>>>>>>>> 
>>>>>>>>>> I just wanted to point out that I just started the
vote. Becket
>>>> wrote
>>>>>>>> the
>>>>>>>>>> proposal and led the discussions.
>>>>>>>>>> 
>>>>>>>>>> What I understood from reading the discussion thread,
the
>>> migration
>>>>>>>> plan
>>>>>>>>>> was discussed at the KIP meeting, and not much on
the mailing
>>> list
>>>>>>>>> itself.
>>>>>>>>>> 
>>>>>>>>>> My question about the migration plan was same as
Guozhang wrote:
>>>> The
>>>>>>>> case
>>>>>>>>>> when an upgraded broker receives an old producer
request. The
>>>>>>>> proposal is
>>>>>>>>>> for the broker to fill in the timestamp field with
the current
>>> time
>>>>> at
>>>>>>>>> the
>>>>>>>>>> broker. Cons: it goes against the definition of CreateTime
type
>>> of
>>>>> the
>>>>>>>>>> timestamp (we are "over-writing" it at the broker).
Pros: It
>>> looks
>>>>>>>> like
>>>>>>>>>> most of the use-cases would actually want that behavior,
because
>>>>>>>>> otherwise
>>>>>>>>>> timestamp is useless and they will need to support
an old,
>>>>>>>> pre-timestamp,
>>>>>>>>>> behavior. E.g., if we modify log retention policy
to use the
>>>>>>>> timestamp,
>>>>>>>>> we
>>>>>>>>>> would need to support an old implementation (the
one that does
>>> not
>>>>> use
>>>>>>>>>> timestamps in the message). So I actually have a
preference for
>>> the
>>>>>>>>>> proposed approach.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Anna
>>>>>>>>>> 
>>>>>>>>>>> On Tue, Dec 22, 2015 at 8:02 PM, Neha Narkhede
<
>>> neha@confluent.io
>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hey Gwen,
>>>>>>>>>>> 
>>>>>>>>>>> Migration plan wasn't really discussed a ton
in the previous
>>>>> threads.
>>>>>>>>> So it
>>>>>>>>>>> will be great to dive deep and see if there are
gaps there. I
>>> had
>>>>>>>> some
>>>>>>>>>>> questions, but the details listed on the KIP
are great.
>>>>>>>>>>> 
>>>>>>>>>>> It is complex, though the plan outlined in the
wiki assumes a
>>> zero
>>>>>>>>> downtime
>>>>>>>>>>> upgrade assuming that producers and consumers
can't be upgraded
>>> in
>>>>>>>>> tandem.
>>>>>>>>>>> This is typical for companies that have a significant
Kafka
>>>>>>>> footprint,
>>>>>>>>> like
>>>>>>>>>>> LinkedIn.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Dec 22, 2015 at 7:48 PM, Gwen Shapira
<
>>> gwen@confluent.io
>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Anna,
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for the KIP, especially for the details
on all the
>>>>>>>> alternatives
>>>>>>>>>>> and
>>>>>>>>>>>> how we arrived at the proposal. Its really
great!
>>>>>>>>>>>> 
>>>>>>>>>>>> Can you point me at where the migration plan
was discussed? It
>>>>> looks
>>>>>>>>>>> overly
>>>>>>>>>>>> complex and I have a bunch of questions,
but if there was a
>>>>>>>> discussion,
>>>>>>>>>>> I'd
>>>>>>>>>>>> like to read up rather than repeating it
:)
>>>>>>>>>>>> 
>>>>>>>>>>>> Gwen
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Dec 22, 2015 at 12:34 PM, Anna
Povzner <
>>>> anna@confluent.io
>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am opening the voting thread for KIP-32:
Add CreateTime and
>>>>>>>>>>>>> LogAppendTime to Kafka message.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> For reference, here's the KIP wiki:
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+CreateTime+and+LogAppendTime+to+Kafka+message
>>>>>>>>>>>>> 
>>>>>>>>>>>>> And the mailing list threads:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> September:
>>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm6NMg%3DPh4HAJdxr%3DpmZhfFcD5OEV2yxj3fg%2BXnEBTW%2B3w%40mail.gmail.com%3E
>>>>>>>>>>>>> 
>>>>>>>>>>>>> October:
>>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/%3CCAHrRUm7RiBAJxwO15s1tztz%3D15oibO-QJ%2B_w8AxafTnuw3jjCw%40mail.gmail.com%3E
>>>>>>>>>>>>> 
>>>>>>>>>>>>> December:
>>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201512.mbox/%3CCAHrRUm4ugxDYzyy26MGRGKpK4hsjT4EKTuu18M3wztYq4PE%3DaQ%40mail.gmail.com%3E
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Anna
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> -- Guozhang
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> -- Guozhang
>> 
>> 
>> 
>> --
>> Thanks,
>> Neha

Mime
View raw message