kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luís Cabral <luis_cab...@yahoo.com.INVALID>
Subject Re: [DISCUSS] KIP-280: Enhanced log compaction
Date Fri, 27 Apr 2018 09:11:36 GMT
 Hi,

The KIP is now updated with the results of the byte array discussion.

This is my first contribution to Kafka, so I'm not sure on what the processes are. Is it
now acceptable to take this into a vote, or should I ask for more contributors to join the
discussion first?

Kind Regards,Luis    On Friday, April 27, 2018, 6:12:03 AM GMT+2, Guozhang Wang <wangguoz@gmail.com>
wrote:  
 
 Hello Luís,

> Offset is an integer? I've only noticed it being resolved as a long so
far.

You are right, offset is a long.

As for timestamp / other types, I left a comment in your PR about handling
tie breakers.

> Given these arguments, is this point something that you absolutely must
have?

No I do not have a strong use case in mind to go with arbitrary byte
arrays, was just thinking that if we are going to enhance log compaction
why not generalize it more :)

Your concern about the memory usage makes sense. I'm happy to take my
suggestion back and enforce only long typed fields.


Guozhang





On Thu, Apr 26, 2018 at 1:44 AM, Luís Cabral <luis_cabral@yahoo.com.invalid>
wrote:

>  Hi,
>
> bq. have a integer typed OffsetMap (for offset)
>
> Offset is an integer? I've only noticed it being resolved as a long so far.
>
>
> bq. long typed OffsetMap (for timestamp)
>
> We would still need to store the offset, as it is functioning as a
> tie-breaker. Not that this is a big deal, we can be easily have both (as
> currently done on the PR).
>
>
> bq. For the byte array typed offset map, we can use a general hashmap,
> where the hashmap's CAPACITY will be reasoned from the given "val memory:
> Int" parameter
>
> If you have a map with 128 byte capacity, then store a value with 16 bytes
> and another with 32 bytes, how many free slots do you have left in this map?
>
> You can make this work, but I think you would need to re-design the whole
> log cleaner approach, which implies changing some of the already existing
> configurations (like "log.cleaner.io.buffer.load.factor"). I would rather
> maintain backwards compatibility as much as possible in this KIP, and if
> this means that using "foo" / "bar" or "2.1-a" / "3.20-b" as record
> versions is not viable, then so be it.
>
> Given these arguments, is this point something that you absolutely must
> have? I'm still sort of hoping that you are just entertaining the idea and
> are ok with having a long (now conceded to be unsigned, so the byte arrays
> can be compared directly).
>
>
> Kind Regards,Luis
>



-- 
-- Guozhang  
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message