kafka-dev mailing list archives

From "Sreenivasulu Nallapati (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-2565) Offset Commit is not working if multiple consumers try to commit the offset
Date Mon, 12 Oct 2015 08:21:05 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952795#comment-14952795 ]

Sreenivasulu Nallapati commented on KAFKA-2565:
-----------------------------------------------

This happens in the following scenario.
Our batch works as follows:
1. Although we run our consumer in batch mode, we use the High Level Consumer
for scalability.
2. The batch opens a Kafka consumer connector.
3. At startup, the consumer identifies the latest offset (consumer stop offset, cso) for
each partition.
4. It reads messages up to the cso for this batch.
5. It writes the messages to temp files on an FTP server. Once all messages are processed, it
moves the temp files to their actual location on the FTP server.
6. If the whole data transfer succeeds, it commits the offsets to ZooKeeper.
7. If we run a single consumer to process all the partitions, there is no issue :)
8. If we start multiple consumers for a single topic (say three consumers and three partitions,
one consumer per partition), the problem starts.
What we observed is this: when one of the three consumers (c1, on partition1) finishes its
processing ahead of the other two, ZooKeeper sees a rebalance and starts reassigning partition1
to one of the two consumers that are still running. While ZooKeeper is doing this, the other
consumers have already consumed all their messages and are in the process of moving the temp
files on the FTP server. We do not close the consumer connector until the end of the batch.
The rebalancing happens after we have stopped consuming messages.

Is there something we are missing or doing wrong here?
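
Steps 3 and 4 above (capture a stop offset at startup, then read only up to it) can be sketched in plain Java with no Kafka dependency. This is a minimal model and not the batch's actual code: the class BoundedBatch, its methods, and the in-memory log list are hypothetical, and it assumes the stop offset is exclusive, that is, the offset of the next message to be written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of one partition's batch: read only up to a stop offset fixed at startup. */
class BoundedBatch {

    /**
     * Returns the messages at offsets [from, stopOffset).
     * Messages at or beyond stopOffset are left for the next batch run.
     */
    static List<String> readUntil(List<String> log, long from, long stopOffset) {
        List<String> batch = new ArrayList<>();
        // Never read past the end of the log, even if stopOffset is larger.
        long end = Math.min(stopOffset, log.size());
        for (long offset = from; offset < end; offset++) {
            batch.add(log.get((int) offset));
        }
        return batch;
    }

    /** Offset to commit after a successful batch: the next offset to be read. */
    static long nextOffsetToCommit(long from, int messagesRead) {
        return from + messagesRead;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a partition log (an assumption for illustration).
        List<String> log = Arrays.asList("m0", "m1", "m2", "m3", "m4");
        long committed = 1L;   // offset stored by the previous batch
        long stopOffset = 3L;  // "cso" captured when this batch started
        List<String> batch = readUntil(log, committed, stopOffset);
        System.out.println("read " + batch);
        System.out.println("commit " + nextOffsetToCommit(committed, batch.size()));
    }
}
```

In this model the commit in step 6 only makes sense if the consumer still owns the partition when it commits, which is the window the rebalancing described above falls into.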




> Offset Commit is not working if multiple consumers try to commit the offset
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-2565
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2565
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.8.1, 0.8.2.1, 0.8.2.2
>            Reporter: Sreenivasulu Nallapati
>            Assignee: Neha Narkhede
>
> We are seeing some strange behaviour with the commitOffsets() method of kafka.javaapi.consumer.ConsumerConnector.
> We commit the offsets to ZooKeeper at the end of the consumer batch. We are running multiple
> consumers for the same topic.
> Test details:
> 1. Created a topic with three partitions.
> 2. Started three consumers (cron jobs) at the same time. The aim is for each consumer
> to process one partition.
> 3. At the end of the batch, each consumer calls the commitOffsets() method on
> kafka.javaapi.consumer.ConsumerConnector.
> 4. The offsets are updated properly in ZooKeeper if we run the consumers for a
> small set of messages (say 1000).
> 5. For a larger number of messages, however, the offset commit does not work as expected: sometimes
> only two offsets are committed properly and the other one remains as it was.
> 6. Please see the example below:
> Partition: 0 Latest Offset: 1057585
> Partition: 1 Latest Offset: 1057715
> Partition: 2 Latest Offset: 1057590
> Earliest Offset after all consumers completed: {0=1057585, 1=724375, 2=1057590}
> The offset for partition 1 (highlighted in red in the original) was supposed to be committed as 1057715, but it was not.
> Please check whether this is a bug with multiple consumers. When multiple consumers try
> to update the same path in ZooKeeper, is there a synchronization issue?
> Kafka cluster details:
> 1 ZooKeeper
> 3 brokers
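
The example in the quoted report can be checked mechanically by comparing the latest offset of each partition with the offset reported after the run. The sketch below is a plain-Java illustration with no Kafka dependency; the class CommitLagCheck and its method are hypothetical, and it assumes the map reported as "Earliest Offset after all consumers completed" holds the offsets stored in ZooKeeper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: finds partitions whose stored offset differs from the latest offset. */
class CommitLagCheck {

    /** Returns partition -> (latest - committed) for every partition where the two differ. */
    static Map<Integer, Long> uncommittedLag(Map<Integer, Long> latest, Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> entry : latest.entrySet()) {
            // A partition with no stored offset is treated as committed at 0 (an assumption).
            long stored = committed.getOrDefault(entry.getKey(), 0L);
            long diff = entry.getValue() - stored;
            if (diff != 0) {
                lag.put(entry.getKey(), diff);
            }
        }
        return lag;
    }

    public static void main(String[] args) {
        // Values copied from the report above.
        Map<Integer, Long> latest = new HashMap<>();
        latest.put(0, 1057585L);
        latest.put(1, 1057715L);
        latest.put(2, 1057590L);

        Map<Integer, Long> committed = new HashMap<>();
        committed.put(0, 1057585L);
        committed.put(1, 724375L);
        committed.put(2, 1057590L);

        System.out.println(uncommittedLag(latest, committed));
    }
}
```

With the reported values, only partition 1 differs, by 1057715 - 724375 = 333340 messages.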



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
