kafka-dev mailing list archives

From "Jason Gustafson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4362) Consumer can fail after reassignment of the offsets topic partition
Date Wed, 02 Nov 2016 05:41:58 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15627824#comment-15627824 ]

Jason Gustafson commented on KAFKA-4362:

[~jjkoshy] Good find. Can you clarify what you mean when you say this causes offset commits
to perpetually fail? I would expect that subsequent offset commits after partition assignment
completes would get the COORDINATOR_NOT_AVAILABLE error. Also, wouldn't that be the proper
error code to return, rather than trying to handle UNKNOWN on the client side?
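
To make the distinction concrete, here is a rough client-side sketch of why the error code
matters. It is not the actual consumer code: the class, the two enums, and the method names
are all hypothetical, and it only assumes that a coordinator-level error leads to rediscovery
while an unknown error is surfaced as a failure.

```java
// Hypothetical sketch, not Kafka's consumer implementation.
public class CommitResponseHandlerSketch {
    // Stand-in error codes for illustration; not Kafka's real Errors enum.
    enum CommitError { NONE, COORDINATOR_NOT_AVAILABLE, UNKNOWN }

    // What the client decides to do with an offset commit response.
    enum Action { DONE, REDISCOVER_COORDINATOR_AND_RETRY, FAIL }

    private boolean coordinatorKnown = true;

    Action handle(CommitError error) {
        switch (error) {
            case NONE:
                return Action.DONE;
            case COORDINATOR_NOT_AVAILABLE:
                // Retriable: forget the current coordinator so the next
                // attempt looks it up again before resending the commit.
                coordinatorKnown = false;
                return Action.REDISCOVER_COORDINATOR_AND_RETRY;
            default:
                // UNKNOWN gives the client nothing to act on, so the commit
                // fails and the stale coordinator is kept.
                return Action.FAIL;
        }
    }

    boolean coordinatorKnown() {
        return coordinatorKnown;
    }
}
```

With a handler shaped like this, returning a coordinator error from the broker would be enough
for the client to recover on its own, with no special handling of UNKNOWN.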

> Consumer can fail after reassignment of the offsets topic partition
> -------------------------------------------------------------------
>                 Key: KAFKA-4362
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4362
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions:
>            Reporter: Joel Koshy
>            Assignee: Mayuresh Gharat
> When a consumer offsets topic partition reassignment completes, an offset commit shows
> {code}
> java.lang.IllegalArgumentException: Message format version for partition 100 not found
>     at kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>     at kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>     at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.4.jar:?]
>     at kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$getMessageFormatVersionAndTimestamp(GroupMetadataManager.scala:632)
>     at 
> ...
> {code}
> The issue is that the replica has been deleted, so {{GroupMetadataManager.getMessageFormatVersionAndTimestamp}}
> throws this exception instead, which propagates to the client as an unknown error.
> Unfortunately consumers don't respond to this and will fail their offset commits.
> One workaround in the above situation is to bounce the cluster - the consumer will be
> forced to rediscover the group coordinator.
> (Incidentally, the message incorrectly prints the number of partitions instead of the
> actual partition.)
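
The failure mode described above can be sketched in a few lines. This is an illustration
only, not the broker code: the class, the error enum, and the localPartitions map are
hypothetical stand-ins. It assumes only what the report states, namely that the lookup throws
when the replica is gone and that the exception reaches the client as an unknown error. The
second method shows one possible alternative, which is to check for the replica first and
return a coordinator error.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Kafka's GroupMetadataManager.
public class OffsetCommitSketch {
    // Stand-in error codes for illustration; not Kafka's real Errors enum.
    enum CommitError { NONE, COORDINATOR_NOT_AVAILABLE, UNKNOWN }

    // Assumed registry: offsets topic partitions hosted locally, mapped to
    // their message format version. An entry vanishes when the replica is deleted.
    static final Map<Integer, Byte> localPartitions = new ConcurrentHashMap<>();

    // Same shape as the lookup in the stack trace: a missing replica throws.
    static byte messageFormatVersion(int partition) {
        return Optional.ofNullable(localPartitions.get(partition))
                .orElseThrow(() -> new IllegalArgumentException(
                        "Message format version for partition " + partition + " not found"));
    }

    // Behaviour as reported: the exception escapes and becomes UNKNOWN.
    static CommitError commitAsReported(int partition) {
        try {
            messageFormatVersion(partition);
            return CommitError.NONE;
        } catch (RuntimeException e) {
            return CommitError.UNKNOWN;
        }
    }

    // Possible alternative: treat a missing replica as "not the coordinator
    // any more" and return an error the client can act on.
    static CommitError commitWithCoordinatorError(int partition) {
        Byte version = localPartitions.get(partition);
        if (version == null) {
            return CommitError.COORDINATOR_NOT_AVAILABLE;
        }
        return CommitError.NONE;
    }
}
```

Under these assumptions, a commit against a partition that has been moved away yields UNKNOWN
in the first method and COORDINATOR_NOT_AVAILABLE in the second.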

This message was sent by Atlassian JIRA
