kafka-dev mailing list archives

From Xavier Léauté (JIRA) <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5079) ProducerBounceTest fails occasionally with a SocketTimeoutException
Date Mon, 17 Apr 2017 22:12:41 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971735#comment-15971735 ]

Xavier Léauté commented on KAFKA-5079:

One possible workaround for the failing PR builds would be to run each test build in its own
container, which would avoid the port conflicts.
Are there any plans to support Docker builds, or something equivalent, within the Apache Jenkins
build infrastructure?
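
For context on the port conflicts mentioned above, here is a minimal sketch of how a test
might preallocate a port. It assumes the common bind-to-port-0 approach; the class and method
names (PortPreallocation, choosePort) are hypothetical and are not taken from the Kafka test
code. The gap between releasing the port and binding it again is where another build on the
same host can take it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

class PortPreallocation {
    // Hypothetical helper (not taken from the Kafka test code): ask the OS for
    // a free ephemeral port by binding to port 0, then release it again.
    static int choosePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = choosePort();
        System.out.println("preallocated port: " + port);
        // The port is free again at this point, so any other process on the
        // host may bind it before the test's broker does.
        try (ServerSocket broker = new ServerSocket(port)) {
            System.out.println("broker bound to " + broker.getLocalPort());
        } catch (IOException e) {
            System.out.println("port conflict: " + e.getMessage());
        }
    }
}
```

Running each build in its own container would give every build a separate network namespace,
so a port chosen this way could not be taken by a concurrent build.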

> ProducerBounceTest fails occasionally with a SocketTimeoutException
> -------------------------------------------------------------------
>                 Key: KAFKA-5079
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5079
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
> {noformat}
> java.net.SocketTimeoutException
> 	at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> 	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> 	at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> 	at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
> 	at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
> 	at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
> 	at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
> 	at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
> 	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
> 	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
> 	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
> 	at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
> 	at kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:116)
> 	at kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:113)
> {noformat}
> This is expected occasionally, since the ports are preallocated and the brokers are bounced in quick succession. Here is the relevant comment from the code:
> {noformat}
>   // This is the one of the few tests we currently allow to preallocate ports, despite the fact that this can result in transient
>   // failures due to ports getting reused. We can't use random ports because of bad behavior that can result from bouncing
>   // brokers too quickly when they get new, random ports. If we're not careful, the client can end up in a situation
>   // where metadata is not refreshed quickly enough, and by the time it's actually trying to, all the servers have
>   // been bounced and have new addresses. None of the bootstrap nodes or current metadata can get them connected to a
>   // running server.
>   //
>   // Since such quick rotation of servers is incredibly unrealistic, we allow this one test to preallocate ports, leaving
>   // a small risk of hitting errors due to port conflicts. Hopefully this is infrequent enough to not cause problems.
> {noformat}
> We should try to look into handling this exception better so that the test doesn't fail
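
One way the exception could be handled more gracefully is to retry the fetch a bounded number
of times when it times out. The following is only a sketch under that assumption, not the fix
adopted for this issue; TimeoutRetry, IoCall and retryOnTimeout are hypothetical names, and the
lambda in main stands in for the SimpleConsumer fetch shown in the stack trace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketTimeoutException;

class TimeoutRetry {
    // Hypothetical functional interface for an I/O call that may time out.
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    // Runs op up to maxAttempts times, retrying only on SocketTimeoutException.
    // Any other IOException, or the last timeout once the attempts are
    // exhausted, is rethrown wrapped in an UncheckedIOException.
    static <T> T retryOnTimeout(IoCall<T> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException e) {
                last = e; // transient: the broker may still be restarting
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated fetch: times out twice, then succeeds.
        String result = retryOnTimeout(() -> {
            if (calls[0]++ < 2) {
                throw new SocketTimeoutException("simulated timeout");
            }
            return "fetched";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A bounded retry keeps a genuine failure visible: if every attempt times out, the last
exception still propagates and the test fails.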

This message was sent by Atlassian JIRA
