pulsar-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] ivankelly commented on issue #2942: Intermittent test failure in TestS3Offload
Date Tue, 27 Nov 2018 14:02:27 GMT
ivankelly commented on issue #2942: Intermittent test failure in TestS3Offload
URL: https://github.com/apache/pulsar/issues/2942#issuecomment-442068673
 
 
   The issue is due to the broker being rebooted, but I'm not sure if the client behaviour
is problematic or not. It's nothing to do with offload in any case.
   
   The sequence is:
   1. Client writes a bunch of messages
   2. Offload kicks some segments out to S3.
   3. The test deletes a ledger and reboots the broker to ensure readback.
   4. The client starts reading back the topic from the very start.
   
   The problem happens at the reboot and readback phase. The broker reboots and starts serving
reads. 
   The client manages to read a good bit. 
   ```
   12:52:21.726 [TestNG-method=testPublishOffloadAndConsumeViaCLI-1:org.apache.pulsar.tests.integration.offload.TestS3Offload@149]
INFO  org.apache.pulsar.tests.integration.offload.TestS3Offload - IKDEBUG read back entry
999 msgid 17:999:-1
   12:52:22.134 [pulsar-client-io-23-1:org.apache.pulsar.client.impl.ClientCnx@183] INFO 
org.apache.pulsar.client.impl.ClientCnx - [id: 0xe4bec5be, L:/127.0.0.1:52048 ! R:localhost/127.0.0.1:46664]
Disconnected
   12:52:22.135 [pulsar-client-io-23-1:org.apache.pulsar.client.impl.ConnectionHandler@109]
INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://s3-offload-test-cli-aojz/ns1/topic1]
[my-sub] Closed connection [id: 0xe4bec5be, L:/127.0.0.1:52048 ! R:localhost/127.0.0.1:46664]
-- Will try again in 0.1 s
   12:52:22.236 [pulsar-timer-25-1:org.apache.pulsar.client.impl.ConnectionHandler@112] INFO
 org.apache.pulsar.client.impl.ConnectionHandler - [persistent://s3-offload-test-cli-aojz/ns1/topic1]
[my-sub] Reconnecting after timeout
   12:52:22.243 [pulsar-client-io-23-1:org.apache.pulsar.client.impl.ClientCnx@720] ERROR
org.apache.pulsar.client.impl.ClientCnx - [id: 0xbad8308a, L:/127.0.0.1:52042 - R:localhost/127.0.0.1:46664]
Close connection because received internal-server error org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException:
syscall:getsockopt(..) failed: Connection refused: pulsar-broker-0/172.18.0.6:6650
   12:52:22.244 [pulsar-client-io-23-1:org.apache.pulsar.client.impl.BinaryProtoLookupService@151]
WARN  org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://s3-offload-test-cli-aojz/ns1/topic1]
failed to send lookup request : org.apache.pulsar.client.api.PulsarClientException$LookupException:
org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException:
io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed:
Connection refused: pulsar-broker-0/172.18.0.6:6650
   java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$LookupException:
org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException:
io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed:
Connection refused: pulsar-broker-0/172.18.0.6:6650
   	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
~[?:1.8.0_191]
   	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
~[?:1.8.0_191]
   	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) ~[?:1.8.0_191]
   	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
~[?:1.8.0_191]
   	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) [?:1.8.0_191]
   	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
[?:1.8.0_191]
   	at org.apache.pulsar.client.impl.ClientCnx.handleLookupResponse(ClientCnx.java:380) [pulsar-client-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
   	at org.apache.pulsar.common.api.PulsarDecoder.channelRead(PulsarDecoder.java:118) [pulsar-client-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
   	at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
[pulsar-client-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
   	at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
[pulsar-client-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
   	at org.apache.pulsar.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
[pulsar-client-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
   	at 
   ```
   
   Then the broker realizes it can't get register, so kills itself.
   ```
   12:52:21.754 [main] ERROR org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl
- Broker znode - [/loadbalance/brokers/pulsar-broker-0:8080] is own by different zookeeper-ssession
72259437335805978 
   12:52:21.755 [main] ERROR org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl
- Unable to create znode - [/loadbalance/brokers/pulsar-broker-0:8080] for load balance on
zookeeper 
   ```
   
   When the broker comes back the subscription is reset back to the start (because we hadn't
acked), but the client continues as if its reading from where it left off.
   
   @merlimat @sijie @jiazhai 
   So a couple of questions:
   - Do we expect that users of the consumer keep track to ensure message ids are monotonically
increasing if they want guarantees on the topic?
   - Why can the client read from a broker before the broker has registered?
   
   All info in build: https://builds.apache.org/job/pulsar_precommit_integrationtests/5164/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message