kafka-jira mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-5957) Producer IllegalStateException due to second deallocate after aborting a batch
Date Sat, 23 Sep 2017 04:28:18 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-5957:
---------------------------------
    Fix Version/s:     (was: 1.0.0)
                   1.1.0

> Producer IllegalStateException due to second deallocate after aborting a batch
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-5957
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5957
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>             Fix For: 1.1.0
>
>
> Saw this recently in a system test failure:
> {code}
> [2017-09-21 05:04:52,033] ERROR [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Aborting producer batches due to fatal error (org.apache.kafka.clients.producer.internals.Sender)
> org.apache.kafka.common.KafkaException: The client hasn't received acknowledgment for
some previously sent messages and can no longer retry them. It isn't safe to continue.
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:211)
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:164)
>         at java.lang.Thread.run(Thread.java:745)
> [2017-09-21 05:04:52,033] TRACE Aborting batch for partition output-topic-2 (org.apache.kafka.clients.producer.internals.ProducerBatch)
> org.apache.kafka.common.KafkaException: The client hasn't received acknowledgment for
some previously sent messages and can no longer retry them. It isn't safe to continue.
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:211)
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:164)
>         at java.lang.Thread.run(Thread.java:745)
> [2017-09-21 05:04:52,134] TRACE [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Not sending transactional request (type=EndTxnRequest, transactionalId=my-second-transactional-id,
producerId=1000, producerEpoch=0, result=COMMIT) because we are in an error state (org.apache.kafka.clients.producer.internals.TransactionManager)
> [2017-09-21 05:04:52,134] INFO [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-09-21 05:04:52,134] DEBUG [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Beginning shutdown of Kafka producer I/O thread, sending remaining records. (org.apache.kafka.clients.producer.internals.Sender)
> [2017-09-21 05:04:52,360] TRACE [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Received produce response from node 1 with correlation id 245 (org.apache.kafka.clients.producer.internals.Sender)
> [2017-09-21 05:04:52,360] DEBUG [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
ProducerId: 1000; Set last ack'd sequence number for topic-partition output-topic-2 to 136
(org.apache.kafka.clients.producer.internals.Sender)
> [2017-09-21 05:04:52,360] TRACE Successfully produced messages to output-topic-2 with
base offset 387. (org.apache.kafka.clients.producer.internals.ProducerBatch)
> [2017-09-21 05:04:52,360] DEBUG ProduceResponse returned for output-topic-2 after batch
had already been aborted. (org.apache.kafka.clients.producer.internals.ProducerBatch)
> [2017-09-21 05:04:52,360] ERROR [Producer clientId=producer-1, transactionalId=my-second-transactional-id]
Uncaught error in request completion: (org.apache.kafka.clients.NetworkClient)
> java.lang.IllegalStateException: Remove from the incomplete set failed. This should be
impossible.
>         at org.apache.kafka.clients.producer.internals.IncompleteBatches.remove(IncompleteBatches.java:44)
>         at org.apache.kafka.clients.producer.internals.RecordAccumulator.deallocate(RecordAccumulator.java:612)
>         at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:585)
>         at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:561)
>         at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:475)
>         at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
>         at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:685)
>         at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
>         at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:481)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:473)
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:225)
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:177)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> Although we allow a batch to be aborted before its produce response returns, we do not
guard against a second call to {{deallocate()}}, which causes this error.
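
One way to picture the missing guard is the minimal sketch below. It is not the actual Kafka patch: SketchBatch, SketchAccumulator, tryComplete() and completeAndDeallocate() are hypothetical stand-ins for ProducerBatch and RecordAccumulator, and the sketch assumes that recording completion with a compare-and-set is enough to decide which caller (the abort or the late produce response) may deallocate.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified stand-in for ProducerBatch; not the real Kafka class.
class SketchBatch {
    private final AtomicBoolean completed = new AtomicBoolean(false);

    // Returns true only for the first caller, whether that is the abort
    // path or the produce-response path.
    boolean tryComplete() {
        return completed.compareAndSet(false, true);
    }
}

// Hypothetical, simplified stand-in for RecordAccumulator.
class SketchAccumulator {
    private final Set<SketchBatch> incomplete = new HashSet<>();
    int deallocations = 0;

    void track(SketchBatch batch) {
        incomplete.add(batch);
    }

    // Mirrors the failure in the trace: removing an untracked batch throws.
    void deallocate(SketchBatch batch) {
        if (!incomplete.remove(batch))
            throw new IllegalStateException("Remove from the incomplete set failed.");
        deallocations++;
    }

    // Guarded path: deallocate only if this call is the one that completed the batch.
    boolean completeAndDeallocate(SketchBatch batch) {
        if (!batch.tryComplete())
            return false;
        deallocate(batch);
        return true;
    }
}

public class DoubleDeallocateSketch {
    public static void main(String[] args) {
        SketchAccumulator accumulator = new SketchAccumulator();
        SketchBatch batch = new SketchBatch();
        accumulator.track(batch);

        boolean aborted = accumulator.completeAndDeallocate(batch); // abort path
        boolean lateAck = accumulator.completeAndDeallocate(batch); // late produce response
        System.out.println(aborted + " " + lateAck + " " + accumulator.deallocations);
        // prints "true false 1"
    }
}
```

In this sketch the second completion attempt becomes a no-op instead of reaching deallocate(), so the IllegalStateException from the incomplete set is never raised.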



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
