kafka-jira mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.
Date Sat, 23 Sep 2017 04:49:04 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-3554:
---------------------------------

*Reminder to the contributor / reviewer of the PR*: please note that the code deadline for
1.0.0 is less than 2 weeks away (Oct. 4th). Please re-evaluate your JIRA and decide whether it
still makes sense to merge it into 1.0.0, whether it could be pushed out to 1.1.0, or whether
it should be closed directly because the JIRA itself is no longer valid. Alternatively,
re-assign yourself as contributor / committer if you are no longer working on the JIRA.

> Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3554
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3554
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.1
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 1.0.0
>
>
> Currently ProducerPerformance always generates the payload with the same bytes. This does not work well for testing compressed data, because the payload is extremely compressible no matter how big it is.
> We can make some changes to make it more useful for compressed messages. Currently I am generating the payload from integers in a given range. By adjusting the range of the integers, we can get different compression ratios.
> API-wise, we can either let the user specify the integer range or the expected compression ratio (we will do some probing to find the corresponding range for the user).
> Besides that, in many cases it is useful to have multiple producer threads when the producer threads themselves are the bottleneck. Admittedly, people can run multiple ProducerPerformance instances to achieve a similar result, but that is still different from the real case, where people actually use the producer.
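
The integer-range idea in the quoted description can be illustrated with a minimal sketch. This is not the code from the actual patch: the class name PayloadSketch, the method names, the use of GZIP as the codec, and the binary-search probing are all assumptions made for illustration. The sketch fills a payload with random integers drawn from [0, range), measures the resulting compression ratio (compressed size divided by original size), and probes for a range that roughly reaches a target ratio, assuming the ratio grows approximately monotonically with the range.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Kafka's ProducerPerformance.
final class PayloadSketch {

    // Fill a payload of sizeBytes with random integers from [0, range).
    // Trailing bytes that do not fit a whole int are left as zero.
    static byte[] generatePayload(int sizeBytes, int range, Random random) {
        ByteBuffer buffer = ByteBuffer.allocate(sizeBytes);
        while (buffer.remaining() >= Integer.BYTES) {
            buffer.putInt(random.nextInt(range));
        }
        return buffer.array();
    }

    // Compression ratio = compressed size / original size (GZIP is an assumption).
    static double compressionRatio(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return (double) out.size() / payload.length;
    }

    // Binary search for an integer range whose payload roughly reaches targetRatio.
    // Assumes the ratio increases (approximately) with the range, so the result
    // is an estimate rather than an exact match.
    static int probeRange(int sizeBytes, double targetRatio, long seed) {
        int low = 1;
        int high = Integer.MAX_VALUE;
        while (low < high) {
            int mid = low + (high - low) / 2;
            byte[] payload = generatePayload(sizeBytes, mid, new Random(seed));
            if (compressionRatio(payload) < targetRatio) {
                low = mid + 1;   // too compressible: widen the range
            } else {
                high = mid;      // ratio reached: try a narrower range
            }
        }
        return low;
    }
}
```

For example, PayloadSketch.probeRange(4096, 0.5, 42L) would return an estimated range for 4 KB payloads that compress to about half their size under GZIP; a range of 1 yields an all-zero, highly compressible payload, while a very large range yields nearly incompressible data.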



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
