kafka-dev mailing list archives

From Becket Qin <becket....@gmail.com>
Subject Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size
Date Wed, 22 Feb 2017 02:41:09 GMT
Hi Apurva,

Yes, it is true that the request size might be much smaller if the batching
is based on uncompressed size. I will let the users know about this. That
said, in practice, this is probably fine. For example, at LinkedIn, our max
message size is 1 MB, and the compressed size is typically 100 KB or
larger. Given that in most cases there are many partitions, the request
size would not be too small (typically around a few MB).
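
As a rough back-of-the-envelope sketch of that arithmetic (not taken from the
producer code): the 1 MB batch and the 0.1 compression ratio follow the
figures above, while the count of 30 partitions per request and the helper
name estimateRequestBytes are assumptions made only for illustration.

```java
public class RequestSizeEstimate {

    /**
     * Estimates the compressed payload of one produce request when each
     * partition contributes one batch that was filled up to
     * uncompressedBatchBytes before compression.
     */
    static long estimateRequestBytes(long uncompressedBatchBytes,
                                     double compressionRatio,
                                     int partitionsPerRequest) {
        // Compressed size of a single batch, e.g. 1 MB * 0.1 = 100 KB.
        long compressedBatchBytes = Math.round(uncompressedBatchBytes * compressionRatio);
        // One batch per partition is packed into the same request.
        return compressedBatchBytes * partitionsPerRequest;
    }

    public static void main(String[] args) {
        long uncompressedBatchBytes = 1_000_000L; // 1 MB batch limit, as in the example above
        double compressionRatio = 0.1;            // compressed / uncompressed, about 100 KB per batch
        int partitionsPerRequest = 30;            // assumed value, for illustration only

        long requestBytes = estimateRequestBytes(
                uncompressedBatchBytes, compressionRatio, partitionsPerRequest);
        System.out.println("Estimated request size in bytes: " + requestBytes);
    }
}
```

With these assumed inputs the estimate comes to about 3 MB per request, which
is in the "few MB" range mentioned above. A topic that compresses much better,
or a request covering far fewer partitions, would give a smaller number.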

At LinkedIn we do have some topics with varying compression ratios. Those are
usually topics shared by different services, so the data may differ a lot
even though it is in the same topic and has similar fields.


Jiangjie (Becket) Qin

On Tue, Feb 21, 2017 at 6:17 PM, Apurva Mehta <apurva@confluent.io> wrote:

> Hi Becket, Thanks for the kip.
> I think one of the risks here is that when compression estimation is
> disabled, you could have much smaller batches than expected, and throughput
> could be hurt. It would be worth adding this to the documentation of this
> setting.
> Also, one of the rejected alternatives states that per topic estimations
> would not work when the compression of individual messages is variable.
> This is true in theory, but in practice one would expect Kafka topics to
> have fairly homogenous data, and hence should compress evenly. I was
> curious if you have data which shows otherwise.
> Thanks,
> Apurva
> On Tue, Feb 21, 2017 at 12:30 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > Hi folks,
> >
> > I would like to start the discussion thread on KIP-126. The KIP proposes
> > adding a new configuration to KafkaProducer to allow batching based on
> > uncompressed message size.
> >
> > Comments are welcome.
> >
> > The KIP wiki is as follows:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-126+-+Allow+KafkaProducer+to+batch+based+on+uncompressed+size
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
