kafka-jira mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5070) org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the state directory: /opt/rocksdb/pulse10/0_18
Date Fri, 21 Jul 2017 04:49:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095773#comment-16095773
] 

Matthias J. Sax commented on KAFKA-5070:
----------------------------------------

Thanks for looping back [~kchen]!

Just to clarify: The WARN level is correct, as the lock issue is supposed to resolve itself.
If it does not resolve itself, there is a bug (but it does not justify to classify the lock
exception as ERROR).

We put a couple of bug fixed for a potential {{0.10.2.2}} release but it in unclear is there
will be a release like this. Those fixes are also in {{0.11.0.0}} that got release recently.
Can you try to upgrade your Streams application to {{0.11.0.0}}? (Note that you don't need
to upgrade the brokers for this.)

If the issue is still in {{0.11.0.0}} it would be good to know and also good to get DEBUG
logs. This would help us to put a fix into {{0.11.0.1}}. Thanks a lot for your support nailing
this down!

> org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the state directory:
/opt/rocksdb/pulse10/0_18
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5070
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5070
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>         Environment: Linux Version
>            Reporter: Dhana
>            Assignee: Matthias J. Sax
>         Attachments: RocksDB_LockStateDirec.7z
>
>
> Notes: we run two instance of consumer in two difference machines/nodes.
> we have 400 partitions. 200  stream threads/consumer, with 2 consumer.
> We perform HA test(on rebalance - shutdown of one of the consumer/broker), we see this
happening
> Error:
> 2017-04-05 11:36:09.352 WARN  StreamThread:1184 StreamThread-66 - Could not create task
0_115. Will retry.
> org.apache.kafka.streams.errors.LockException: task [0_115] Failed to lock the state
directory: /opt/rocksdb/pulse10/0_115
> 	at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:102)
> 	at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73)
> 	at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108)
> 	at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
> 	at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
> 	at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
> 	at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
> 	at org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
> 	at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
> 	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
> 	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message