Date: Fri, 27 Apr 2018 09:11:36 +0000 (UTC)
From: Luís Cabral
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction

Hi,

The KIP is now updated with the results of the byte array discussion.
This is my first contribution to Kafka, so I'm not sure what the process is. Is it now acceptable to take this to a vote, or should I ask for more contributors to join the discussion first?

Kind Regards,
Luis

On Friday, April 27, 2018, 6:12:03 AM GMT+2, Guozhang Wang wrote:

Hello Luís,

> Offset is an integer? I've only noticed it being resolved as a long so far.

You are right, offset is a long.

As for timestamp / other types, I left a comment in your PR about handling tie breakers.

> Given these arguments, is this point something that you absolutely must have?

No, I do not have a strong use case in mind for arbitrary byte arrays; I was just thinking that if we are going to enhance log compaction, why not generalize it more :)

Your concern about the memory usage makes sense. I'm happy to take my suggestion back and enforce only long typed fields.

Guozhang

On Thu, Apr 26, 2018 at 1:44 AM, Luís Cabral wrote:

> Hi,
>
> bq. have a integer typed OffsetMap (for offset)
>
> Offset is an integer? I've only noticed it being resolved as a long so far.
>
> bq. long typed OffsetMap (for timestamp)
>
> We would still need to store the offset, as it is functioning as a
> tie-breaker. Not that this is a big deal, we can easily have both (as
> currently done on the PR).
>
> bq. For the byte array typed offset map, we can use a general hashmap,
> where the hashmap's CAPACITY will be reasoned from the given "val memory:
> Int" parameter
>
> If you have a map with 128 byte capacity, then store a value with 16 bytes
> and another with 32 bytes, how many free slots do you have left in this map?
>
> You can make this work, but I think you would need to re-design the whole
> log cleaner approach, which implies changing some of the already existing
> configurations (like "log.cleaner.io.buffer.load.factor").
> I would rather maintain backwards compatibility as much as possible in this
> KIP, and if this means that using "foo" / "bar" or "2.1-a" / "3.20-b" as
> record versions is not viable, then so be it.
>
> Given these arguments, is this point something that you absolutely must
> have? I'm still sort of hoping that you are just entertaining the idea and
> are ok with having a long (now conceded to be unsigned, so the byte arrays
> can be compared directly).
>
> Kind Regards,
> Luis
>

--
-- Guozhang
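
P.S. On the "unsigned, so the byte arrays can be compared directly" remark quoted above, here is a minimal sketch of what that property looks like in Java. It is only an illustration under my own assumptions: the class and method names (UnsignedVersionCompare, toBytes, compareAsBytes) are hypothetical and are not taken from Kafka or from the KIP-280 PR, and it assumes the long is serialized as 8 big-endian bytes. Under that assumption, a byte-by-byte unsigned comparison orders values the same way as Long.compareUnsigned, which is not the case for the signed Long.compare once the high bit is set.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Kafka or the KIP-280 PR.
final class UnsignedVersionCompare {

    // Encode a long as 8 bytes in big-endian order (ByteBuffer's default byte order).
    static byte[] toBytes(long version) {
        return ByteBuffer.allocate(Long.BYTES).putLong(version).array();
    }

    // Compare two versions through their byte encodings, treating each byte as unsigned.
    static int compareAsBytes(long a, long b) {
        return Arrays.compareUnsigned(toBytes(a), toBytes(b));
    }

    public static void main(String[] args) {
        long small = 1L;
        long large = -1L; // all bits set: 2^64 - 1 when interpreted as unsigned

        // The signed comparison disagrees with the other two for this pair.
        System.out.println("signed compare:   " + Long.compare(small, large));
        System.out.println("unsigned compare: " + Long.compareUnsigned(small, large));
        System.out.println("byte compare:     " + compareAsBytes(small, large));
    }
}
```

Only the sign of each result matters. The sketch needs Java 9 or later for Arrays.compareUnsigned, and it says nothing about how the log cleaner itself stores or compares these values.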