kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1326) New consumer checklist
Date Thu, 02 Oct 2014 21:48:34 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157247#comment-14157247 ]

Guozhang Wang commented on KAFKA-1326:

A couple more points we have seen with the old consumer that need to be carefully addressed
in the new consumer:

1. Memory management / decompression: in the old consumer, decompression can easily allocate
a huge amount of memory within a very short time. We need to add memory management similar
to the new producer's to bound memory usage, and at the same time make sure that we do not
allocate more memory than necessary while decompressing.
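To illustrate the bounding idea in point 1, here is a minimal sketch (hypothetical class and method names, not the actual Kafka code): decompress in small reusable chunks and fail fast once a configured cap is exceeded, instead of materializing an unbounded payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BoundedDecompressor {
    // Decompress gzip data in fixed-size chunks, failing fast once the
    // output would exceed maxBytes rather than buffering it all first.
    public static byte[] decompress(byte[] gzipped, int chunkSize, int maxBytes) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[chunkSize];   // one small reusable buffer
            int n;
            while ((n = in.read(chunk)) != -1) {
                if (out.size() + n > maxBytes)
                    throw new IllegalStateException("decompressed size exceeds bound of " + maxBytes);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Helper used only for the example: gzip a byte array.
    public static byte[] gzip(byte[] raw) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) { gz.write(raw); }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The real consumer would additionally draw the chunk buffers from a shared pool so that total memory across all fetches is bounded, as the new producer does for batching.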

2. Max fetch size: since the old consumer's max fetch size is fixed by a config, it requires
a matching config on the broker / producer. It should be easy to let the new consumer
dynamically increase its max fetch size when it receives a single partial message, so that
these coupled configs are no longer necessary.
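The dynamic-growth idea in point 2 can be sketched as follows (a toy model with hypothetical names, not the actual fetch protocol): when a fetch returns a truncated message, double the fetch size and retry, up to a cap.

```java
public class AdaptiveFetch {
    // Simulated fetch: returns at most fetchSize bytes of a message of
    // messageSize bytes; a short read means the message was truncated.
    static int fetch(int messageSize, int fetchSize) {
        return Math.min(messageSize, fetchSize);
    }

    // Double the fetch size until a single message fits, instead of
    // requiring operators to pre-align the consumer's max fetch size
    // with the broker's and producer's max message size.
    public static int fetchSizeNeeded(int messageSize, int initialFetchSize, int cap) {
        int fetchSize = initialFetchSize;
        while (fetch(messageSize, fetchSize) < messageSize) {
            if (fetchSize >= cap)
                throw new IllegalStateException("message larger than fetch size cap " + cap);
            fetchSize = Math.min(cap, fetchSize * 2);
        }
        return fetchSize;
    }
}
```

For example, a 5000-byte message fetched with an initial 1024-byte fetch size grows through 2048 and 4096 to 8192 bytes, at which point the message is read whole.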

3. Problems with the consumer iterator: today's consumer iterator has several inconsistencies
with the Java Iterator contract (for example KAFKA-520), and an exception thrown while
de-serializing a message causes that message to be skipped, since iter.next() has already
been called. Although we are not going with a stream-based API in the new consumer but
instead use a non-blocking polling model, we need to make sure these usage-pattern issues
are not carried into the new API.
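The iterator pitfall in point 3 can be shown with a small toy model (hypothetical names; this is not the actual consumer code): when next() deserializes eagerly, a deserialization failure loses the record because the cursor has already advanced, whereas handing raw bytes to the caller keeps the failure handling in the caller's hands.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

public class PollVsIterator {
    // Payloads starting with "!" simulate records that fail deserialization.
    static String deserialize(byte[] raw) {
        String s = new String(raw);
        if (s.startsWith("!")) throw new IllegalArgumentException("bad record: " + s);
        return s;
    }

    // Iterator style: the cursor advances before deserialization, so a
    // throw silently drops the record (the KAFKA-520 class of problems).
    public static List<String> consumeViaIterator(ArrayDeque<byte[]> log) {
        List<String> out = new ArrayList<>();
        while (!log.isEmpty()) {
            byte[] raw = log.poll();        // cursor advances first...
            try {
                out.add(deserialize(raw));  // ...then deserialization may throw
            } catch (IllegalArgumentException e) {
                // record is gone; the caller never sees it
            }
        }
        return out;
    }

    // Poll style: raw bytes reach the caller, who decides how to handle
    // a bad record without losing its position in the stream.
    public static List<String> consumeViaPoll(ArrayDeque<byte[]> log) {
        List<String> out = new ArrayList<>();
        for (byte[] raw : log) {
            try {
                out.add(deserialize(raw));
            } catch (IllegalArgumentException e) {
                out.add("<failed: " + new String(raw) + ">"); // caller's choice
            }
        }
        return out;
    }
}
```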

> New consumer checklist
> ----------------------
>                 Key: KAFKA-1326
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1326
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>    Affects Versions: 0.9.0
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>              Labels: feature
> We will use this JIRA to track the list of issues to resolve to get a working new consumer
client. The consumer client can work in phases -
> 1. Add new consumer APIs and configs
> 2. Refactor Sender. We will need to use some common APIs from Sender.java (https://issues.apache.org/jira/browse/KAFKA-1316)
> 3. Add metadata fetch and refresh functionality to the consumer (This will require https://issues.apache.org/jira/browse/KAFKA-1316)
> 4. Add functionality to support subscribe(TopicPartition...partitions). This will add
SimpleConsumer functionality to the new consumer. This does not include any group management
related work.
> 5. Add ability to commit offsets to Kafka. This will include adding functionality to
the commit()/commitAsync()/committed() APIs. This still does not include any group management
related work.
> 6. Add functionality to the offsetsBeforeTime() API.
> 7. Add consumer co-ordinator election to the server. This will only add a new module
for the consumer co-ordinator, but not necessarily all the logic to do group management. 
> At this point, we will have a fully functional standalone consumer and a server side
co-ordinator module. This will be a good time to start adding group management functionality
to the server and consumer.
> 8. Add failure detection capability to the consumer when group management is used. This
will not include any rebalancing logic, just the ability to detect failures using session.timeout.ms.
> 9. Add rebalancing logic to the server and consumer. This will be a tricky and potentially
large change since it will involve implementing the group management protocol.
> 10. Add system tests for the new consumer
> 11. Add metrics 
> 12. Convert mirror maker to use the new consumer.
> 13. Convert perf test to use the new consumer
> 14. Performance testing and analysis.
> 15. Review and fine tune log4j logging
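The failure detection in step 8 of the checklist above can be sketched as a heartbeat deadline check driven by session.timeout.ms (a minimal sketch with hypothetical names; the real coordinator logic is more involved):

```java
public class SessionTimeoutDetector {
    private final long sessionTimeoutMs;
    private long lastHeartbeatMs;

    public SessionTimeoutDetector(long sessionTimeoutMs, long nowMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.lastHeartbeatMs = nowMs;
    }

    // Called whenever the coordinator receives a heartbeat from this member.
    public void recordHeartbeat(long nowMs) {
        lastHeartbeatMs = nowMs;
    }

    // The member is considered failed once session.timeout.ms elapses
    // without a heartbeat; a rebalance would then be triggered.
    public boolean isAlive(long nowMs) {
        return nowMs - lastHeartbeatMs < sessionTimeoutMs;
    }
}
```

Note that this deliberately contains no rebalancing logic, matching the step's scope: detection only, with rebalancing deferred to step 9.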

This message was sent by Atlassian JIRA
