kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-2486) New consumer performance
Date Fri, 28 Aug 2015 18:32:46 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720110 ]

Guozhang Wang edited comment on KAFKA-2486 at 8/28/15 6:31 PM:
---------------------------------------------------------------

OK, a few things got mixed up in my previous comment. First of all, "retry.backoff" should only apply to failed requests, not to requests that return no data; I did not catch that when reviewing, and it needs to be fixed for sure. No question about it.
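
To make that distinction concrete, here is a minimal sketch of where a retry.backoff.ms-style pause should and should not apply. The FetchClient/FetchResponse types are hypothetical stand-ins, not the actual Fetcher code:

{code:java}
// Sketch only: illustrates when retry.backoff.ms should kick in.
// FetchClient/FetchResponse are hypothetical stand-ins, not real client classes.
public final class FetchBackoffSketch {
    interface FetchResponse { boolean failed(); }
    interface FetchClient { FetchResponse fetch(); }

    static void pollLoop(FetchClient client, long retryBackoffMs) throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            FetchResponse response = client.fetch();
            if (response.failed()) {
                // Only a failed request should honor retry.backoff.ms before retrying.
                Thread.sleep(retryBackoffMs);
            }
            // A successful response with no data should NOT trigger the backoff:
            // the broker already held the request for up to fetch.max.wait.ms (long poll).
        }
    }
}
{code}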

Now, returning to performance profiling: by default the fetch requests are long polls:

FETCH_MIN_BYTES_CONFIG = 1024
FETCH_MAX_WAIT_MS_CONFIG = 500

but I remember that in the consumer performance test they were once overridden to 1 and 0 (I see those overridden values have been removed now, but I am pretty confident they were overwritten at some point), in which case we may want to back off a bit instead of DDoSing the broker. The real question is: for performance profiling, where data keeps arriving from a continuously sending producer and there is no processing time after consumption, what are the optimal configs? I think long polls should still be preferable for throughput, but they will give sub-optimal latencies.
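
For reference, a rough sketch of how those settings map onto the new consumer's config; the broker address, group id, and deserializers are placeholders, and the commented-out lines show the 1/0 overrides mentioned above:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class FetchConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "perf-consumer");           // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Long-poll settings quoted above: wait for data to accumulate, favoring throughput.
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);

        // The overrides the perf test reportedly used at one point: return immediately,
        // which lowers latency but can hammer the broker when no data is available.
        // props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1);
        // props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 0);

        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        consumer.close();
    }
}
{code}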


> New consumer performance
> ------------------------
>
>                 Key: KAFKA-2486
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2486
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Jason Gustafson
>             Fix For: 0.8.3
>
>
> The new consumer was previously getting good performance. However, a recent report on the mailing list indicates it has dropped significantly. After evaluation, even with a local broker it seems to only be reaching 2-10 MB/s, compared to 600+ MB/s previously. Before release, we should get the performance back on par.
> Some details from the mailing list about where the regression occurred (http://mail-archives.apache.org/mod_mbox/kafka-dev/201508.mbox/%3CCAAdKFaE8bPSeWZf%2BF9RuA-xZazRpBrZG6vo454QLVHBAk_VOJg%40mail.gmail.com%3E):
> bq. At 49026f11781181c38e9d5edb634be9d27245c961 (May 14th), we went from good performance -> an error due to the broker apparently not accepting the partition assignment strategy. Since this commit seems to add heartbeats and the server-side code for partition assignment strategies, I assume we were missing something on the client side, and by filling in the server side, things stopped working.
> bq. On either 84636272422b6379d57d4c5ef68b156edc1c67f8 or a5b11886df8c7aad0548efd2c7c3dbc579232f03 (July 17th), I am able to run the perf test again, but it's slow -- ~10 MB/s for me vs. the 2 MB/s Jay was seeing, which is still far less than the 600 MB/s I saw on the earlier commits.
> Ideally we would also at least have a system test in place for the new consumer, even if regressions weren't automatically detected; it would at least allow for manually checking for regressions. This should not be difficult since there are already old consumer performance tests.
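
As a rough illustration of the kind of manual check suggested above, a minimal throughput probe against the released new-consumer API might look like the sketch below; the broker address, topic name, and 60-second window are arbitrary placeholders, not part of the existing perf tooling:

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerThroughputProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "throughput-probe");        // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        long totalBytes = 0L;
        long deadline = System.currentTimeMillis() + 60_000L; // arbitrary 60-second window
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("perf-topic"));      // placeholder topic
            long start = System.currentTimeMillis();
            while (System.currentTimeMillis() < deadline) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(500);
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    totalBytes += (record.key() == null ? 0 : record.key().length)
                                + (record.value() == null ? 0 : record.value().length);
                }
            }
            double seconds = (System.currentTimeMillis() - start) / 1000.0;
            System.out.printf("Consumed %.2f MB/s%n", totalBytes / (1024.0 * 1024.0) / seconds);
        }
    }
}
{code}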



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
