kafka-jira mailing list archives

From "Jiangjie Qin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.
Date Fri, 17 Nov 2017 19:02:01 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257400#comment-16257400 ]

Jiangjie Qin commented on KAFKA-3554:
-------------------------------------

[~airbots] Thanks for volunteering to help. The patch needs a rebase, again. I guess the
reviewers are currently busy. [~ijuma] Would you have time to look at this patch if I do a rebase?
I am not sure whether we need a KIP for this, though. Sometimes we submit KIPs for tooling changes
and sometimes we don't. I am neutral on this one. Let me know if you prefer a KIP.

> Generate actual data with specific compression ratio and add multi-thread support in
the ProducerPerformance tool.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3554
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3554
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.1
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 1.1.0
>
>
> Currently the ProducerPerformance tool always generates the payload with the same bytes. This
does not work well for testing compressed data, because the payload is extremely compressible no
matter how big it is.
> We can make some changes to make it more useful for compressed messages. Currently I
am generating a payload containing integers from a given range. By adjusting the range of
the integers, we can get different compression ratios.
> API-wise, we can either let the user specify the integer range or the expected compression
ratio (we will do some probing to find the corresponding range for the user).
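> The range-based payload and the probing step can be sketched roughly as follows. This is
only an illustration, not the code in the patch: the class and method names (PayloadProbe,
generatePayload, compressionRatio, probeRange) are hypothetical, java.util.zip.Deflater stands
in for whatever codec the producer is configured with, and the binary search assumes the
ratio grows roughly monotonically with the range.

```java
import java.nio.ByteBuffer;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical sketch; none of these names come from the actual patch.
class PayloadProbe {

    // Fill a payload with random integers drawn from [0, range).
    // A small range repeats values often, so the payload compresses well.
    static byte[] generatePayload(int sizeBytes, int range, Random random) {
        ByteBuffer buffer = ByteBuffer.allocate(sizeBytes);
        while (buffer.remaining() >= Integer.BYTES) {
            buffer.putInt(random.nextInt(range));
        }
        return buffer.array();
    }

    // Compressed size divided by original size, using Deflater as a
    // stand-in codec (assumption: the real tool would use the producer's codec).
    static double compressionRatio(byte[] payload) {
        Deflater deflater = new Deflater();
        deflater.setInput(payload);
        deflater.finish();
        byte[] chunk = new byte[8192];
        long compressedBytes = 0;
        while (!deflater.finished()) {
            compressedBytes += deflater.deflate(chunk);
        }
        deflater.end();
        return (double) compressedBytes / payload.length;
    }

    // Binary-search for a range whose sample payload reaches at least
    // targetRatio. Each probe reuses the same seed so results are repeatable.
    static int probeRange(double targetRatio, int sizeBytes, long seed) {
        int low = 1;
        int high = Integer.MAX_VALUE;
        while (low < high) {
            int mid = low + (high - low) / 2;
            byte[] sample = generatePayload(sizeBytes, mid, new Random(seed));
            if (compressionRatio(sample) < targetRatio) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    public static void main(String[] args) {
        int range = probeRange(0.5, 64 * 1024, 42L);
        byte[] payload = generatePayload(64 * 1024, range, new Random(42L));
        System.out.printf("range=%d ratio=%.3f%n", range, compressionRatio(payload));
    }
}
```

The ratio measured on one sample payload is only an estimate; a real tool would probably
probe with the configured record size and codec, since both affect the result.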
> Besides that, in many cases it is useful to have multiple producer threads when the
producer threads themselves are the bottleneck. Admittedly, people can run multiple ProducerPerformance
instances to achieve a similar result, but that is still different from the real case in which
people actually use the producer.
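
> A minimal sketch of the multi-threaded driver idea, under stated assumptions: the class
MultiThreadedSendSketch and its run method are hypothetical, and a plain Consumer<byte[]>
stands in for the producer's send call so the example runs without a Kafka dependency. The
message does not say whether the patch shares one producer across threads or creates one
per thread; this sketch shares a single sink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch; the sink is a stand-in for a real producer's send call.
class MultiThreadedSendSketch {

    // Start numThreads workers; each pushes recordsPerThread payloads into the
    // shared sink. Returns the total number of records handed to the sink.
    static long run(int numThreads, int recordsPerThread, byte[] payload,
                    Consumer<byte[]> sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        AtomicLong sent = new AtomicLong();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int t = 0; t < numThreads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < recordsPerThread; i++) {
                        sink.accept(payload);
                        sent.incrementAndGet();
                    }
                }));
            }
            // Wait for every worker and surface any failure.
            for (Future<?> future : futures) {
                future.get();
            }
        } finally {
            pool.shutdown();
        }
        return sent.get();
    }

    public static void main(String[] args) throws Exception {
        AtomicLong bytes = new AtomicLong();
        long start = System.nanoTime();
        long records = run(4, 100_000, new byte[100], p -> bytes.addAndGet(p.length));
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d records, %d bytes, %.0f records/sec%n",
                records, bytes.get(), records / seconds);
    }
}
```

With a real producer in place of the sink, the sink must be safe to call from several
threads at once, as the AtomicLong-based stand-in above is.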



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
