kafka-jira mailing list archives

From "huxihx (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6345) NetworkClient.inFlightRequestCount() is not thread safe, causing ConcurrentModificationExceptions when sensors are read
Date Thu, 14 Dec 2017 00:25:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290140#comment-16290140 ]

huxihx commented on KAFKA-6345:

A straightforward solution is to add a thread-safe count method that is used only for JMX metrics.
The safe version would take a snapshot of the map by deep-copying each of its entries and count from the copy.
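
A minimal sketch of that idea follows. InFlightTracker, its fields, and its method names are hypothetical stand-ins, not Kafka's actual InFlightRequests code. It also makes one assumption the comment does not spell out: the copy is taken under a lock that the writer also holds, because copying a HashMap means iterating it, which would be just as unsafe as the original count if done unguarded.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for InFlightRequests; names are illustrative only.
class InFlightTracker {
    private final Object lock = new Object();
    private final Map<String, Deque<String>> requests = new HashMap<>();

    // Called by the client I/O thread when a request is sent.
    void add(String node, String request) {
        synchronized (lock) {
            requests.computeIfAbsent(node, k -> new ArrayDeque<>()).addFirst(request);
        }
    }

    // Called by the client I/O thread when the oldest request for a node completes.
    String completeNext(String node) {
        synchronized (lock) {
            Deque<String> queue = requests.get(node);
            return queue == null ? null : queue.pollLast();
        }
    }

    // Safe count intended for metrics readers (e.g. a JMX thread).
    int safeCount() {
        Map<String, Deque<String>> snapshot = new HashMap<>();
        // Assumption: the deep copy must happen under the writer's lock.
        synchronized (lock) {
            for (Map.Entry<String, Deque<String>> e : requests.entrySet()) {
                snapshot.put(e.getKey(), new ArrayDeque<>(e.getValue()));
            }
        }
        // Counting happens on the private copy, outside the lock.
        int total = 0;
        for (Deque<String> queue : snapshot.values()) {
            total += queue.size();
        }
        return total;
    }
}
```

The trade-off in this sketch is that the I/O thread now takes a lock on every send and completion, and each metrics read allocates a full copy of the map.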

> NetworkClient.inFlightRequestCount() is not thread safe, causing ConcurrentModificationExceptions when sensors are read
> -----------------------------------------------------------------------------------------------------------------------
>                 Key: KAFKA-6345
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6345
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 1.0.0
>            Reporter: radai rosenblatt
> example stack trace (code is ~0.10.2.*)
> {code}
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
> 	at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> 	at java.util.HashMap$ValueIterator.next(HashMap.java:1458)
> 	at org.apache.kafka.clients.InFlightRequests.inFlightRequestCount(InFlightRequests.java:109)
> 	at org.apache.kafka.clients.NetworkClient.inFlightRequestCount(NetworkClient.java:382)
> 	at org.apache.kafka.clients.producer.internals.Sender$SenderMetrics$1.measure(Sender.java:480)
> 	at org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:61)
> 	at org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:52)
> 	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttribute(JmxReporter.java:183)
> 	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttributes(JmxReporter.java:193)
> 	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttributes(DefaultMBeanServerInterceptor.java:709)
> 	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttributes(JmxMBeanServer.java:705)
> {code}
> looking at latest trunk, the code is still vulnerable:
> # NetworkClient.inFlightRequestCount() eventually iterates over InFlightRequests.requests.values(), which is backed by a (non-thread-safe) HashMap
> # this will be called from the "requests-in-flight" sensor's measure() method (Sender.java line ~765 in the SenderMetrics constructor), which would be driven by some thread reading JMX values
> # the HashMap in question would also be updated by some client io thread calling NetworkClient.doSend() (which calls into InFlightRequests.add())
> i guess the only upside is that this exception will always happen on the thread reading the JMX values and never on the actual client io thread ...
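
The failure mode in the list above can be shown deterministically in a single thread. This is only an illustrative sketch: FailFastDemo and its map are hypothetical, and the put() inside the loop stands in for the client io thread registering a new destination while the JMX thread is still iterating. In the real issue the two steps run on different threads, so the exception appears only intermittently.

```java
import java.util.ArrayDeque;
import java.util.ConcurrentModificationException;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of Kafka.
class FailFastDemo {
    // Returns true if iterating values() fails after a structural change to the map.
    static boolean iterationFailsAfterModification() {
        Map<String, Deque<String>> requests = new HashMap<>();
        requests.put("node-1", new ArrayDeque<>());
        requests.put("node-2", new ArrayDeque<>());
        try {
            int total = 0;
            for (Deque<String> queue : requests.values()) {
                total += queue.size();
                // Stands in for the io thread adding a new key mid-iteration.
                requests.put("node-3", new ArrayDeque<>());
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // HashMap iterators are fail-fast: the next step of the iteration
            // notices the changed modification count and throws.
            return true;
        }
    }
}
```

Note that the fail-fast check is best-effort. With genuinely concurrent access and no synchronization, a reader may also see a wrong count without any exception being thrown.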
