pulsar-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] rdhabalia opened a new pull request #2909: Handle broker shutdown for already deleted load-balancer znode
Date Thu, 01 Nov 2018 22:32:40 GMT
rdhabalia opened a new pull request #2909: Handle broker shutdown for already deleted load-balancer
znode
URL: https://github.com/apache/pulsar/pull/2909
 
 
   ### Motivation
   
   If the broker's load-balancer znode gets deleted (manually or by a script), the broker's
shutdown creates a temporary split-brain, and the broker that newly owns the bundle fails to
update the managed-ledger znode with this exception:
   ```
   22:24:33.258 [bookkeeper-ml-workers-OrderedExecutor-12-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedCursorImpl
- [prop/ns/persistent/topic][usnc1] Metadata ledger creation failed
   : 
   org.apache.bookkeeper.mledger.ManagedLedgerException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException:
KeeperErrorCode = BadVersion
   ```
   
   This happens because broker shutdown completes immediately if the broker's load-balancer
znode doesn't exist, and it skips unloading the bundles, which are then owned again by a new
broker. Closing `ManagedLedgerFactory` on the old broker still tries to update the zk-node,
which changes its version and causes a `BadVersionException` on the new broker.
   
   ```
   08:53:07.874 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.web.WebService - Web
service closed
   08:53:07.874 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.service.BrokerService
- Shutting down Pulsar Broker service
   08:53:07.880 [shutdown-thread-49-1] ERROR org.apache.pulsar.broker.service.BrokerService
- Failed to disable broker from loadbalancer list org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /loadbalance/brokers/my-broker.com:4080
   org.apache.pulsar.broker.PulsarServerException: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /loadbalance/brokers/my-broker.com:4080
           at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.disableBroker(ModularLoadManagerImpl.java:565)
~[pulsar-broker-2.2.0.jar:2.2.0]
           at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerWrapper.disableBroker(ModularLoadManagerWrapper.java:49)
~[pulsar-broker-2.2.0.jar:2.2.0]
           at org.apache.pulsar.broker.service.BrokerService.unloadNamespaceBundlesGracefully(BrokerService.java:427)
~[pulsar-broker-2.2.0.jar:2.2.0]
           at org.apache.pulsar.broker.service.BrokerService.close(BrokerService.java:387)
~[pulsar-broker-2.2.0.jar:2.2.0]
           at org.apache.pulsar.broker.PulsarService.close(PulsarService.java:221) ~[pulsar-broker-2.2.0.jar:2.2.0]
           at org.apache.pulsar.broker.MessagingServiceShutdownHook.lambda$0(MessagingServiceShutdownHook.java:62)
~[pulsar-broker-2.2.0.jar:2.2.0]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_131]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_131]
   08:53:07.916 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.service.BrokerService
- Broker service completely shut down
   08:53:07.917 [shutdown-thread-49-1] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl
- Closing 18191 ledgers
   ```
   The broker should be more resilient in such cases and avoid these failures.
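
To make the version conflict concrete, here is a minimal sketch of the race. `VersionedNode` and `SplitBrainDemo` are hypothetical in-memory stand-ins, not Pulsar or ZooKeeper classes; they only imitate the conditional-write behaviour of ZooKeeper's `setData(path, data, version)`, which rejects a write when the caller's expected version is stale.

```java
/** Hypothetical in-memory stand-in for a versioned znode (not ZooKeeper's API). */
class VersionedNode {
    private byte[] data = new byte[0];
    private int version = 0;

    synchronized int getVersion() {
        return version;
    }

    /** Conditional write: succeeds only if expectedVersion matches the current version. */
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != version) {
            throw new IllegalStateException(
                    "BadVersion: expected " + expectedVersion + " but node is at " + version);
        }
        data = newData.clone();
        return ++version;
    }
}

class SplitBrainDemo {
    /** Returns true if the new owner's write is rejected after the old owner's late write. */
    static boolean newOwnerWriteFails() {
        VersionedNode ledgerNode = new VersionedNode();

        // The old broker still holds the version it last saw.
        int oldOwnerView = ledgerNode.getVersion();
        // The new broker takes over the bundle and reads the same version.
        int newOwnerView = ledgerNode.getVersion();

        // The old broker never unloaded the bundle, so closing its ledgers
        // during shutdown writes to the node and bumps the version.
        ledgerNode.setData("closed-by-old-broker".getBytes(), oldOwnerView);

        try {
            // The new broker's cached version is now stale.
            ledgerNode.setData("update-by-new-broker".getBytes(), newOwnerView);
            return false;
        } catch (IllegalStateException badVersion) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("new owner write rejected: " + newOwnerWriteFails());
    }
}
```

The sketch collapses both brokers into one thread; in the real failure the two writes come from different processes, but the stale-version rejection is the same.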
   
   ### Modifications
   
   The broker now unloads its namespace bundles gracefully even if its load-balancer znode
doesn't exist, and then closes the zk-client connection and releases the namespace-bundle
ephemeral znodes.
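
The intended shutdown order can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this pull request: `LoadManager`, `MissingZnodeException`, and `ShutdownSketch` are hypothetical names, and the real steps are replaced by entries in a list so the order is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for ZooKeeper's NoNode error. */
class MissingZnodeException extends Exception {
    MissingZnodeException(String path) {
        super("NoNode for " + path);
    }
}

/** Hypothetical, simplified view of the load manager used during shutdown. */
interface LoadManager {
    void disableBroker() throws MissingZnodeException;
}

class ShutdownSketch {
    /** Records the steps taken, in order, so the sequence can be checked. */
    final List<String> steps = new ArrayList<>();
    private final LoadManager loadManager;
    private final List<String> ownedBundles;

    ShutdownSketch(LoadManager loadManager, List<String> ownedBundles) {
        this.loadManager = loadManager;
        this.ownedBundles = ownedBundles;
    }

    void close() {
        try {
            loadManager.disableBroker();
            steps.add("disabled-in-load-balancer");
        } catch (MissingZnodeException e) {
            // The znode is already gone: note it and keep going
            // instead of skipping the unload.
            steps.add("load-balancer-znode-missing");
        }
        // Unload every owned bundle regardless of the outcome above.
        for (String bundle : ownedBundles) {
            steps.add("unloaded:" + bundle);
        }
        // Closing the zk-client last releases the ephemeral ownership znodes.
        steps.add("closed-zk-client");
    }

    public static void main(String[] args) {
        ShutdownSketch broker = new ShutdownSketch(
                () -> { throw new MissingZnodeException("/loadbalance/brokers/my-broker.com:4080"); },
                List.of("bundle-a", "bundle-b"));
        broker.close();
        System.out.println(broker.steps);
    }
}
```

The point of the sketch is that a missing load-balancer znode is treated as a recoverable condition: the unload loop runs in both branches, so no bundle is left for the old broker to touch after a new owner has taken over.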
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
