pulsar-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] rdhabalia opened a new pull request #3556: [pulsar-broker] cursor: safe guard to avoid cursor-znode after cursor is closed
Date Fri, 08 Feb 2019 22:19:27 GMT
rdhabalia opened a new pull request #3556: [pulsar-broker] cursor: safe guard to avoid cursor-znode
after cursor is closed
URL: https://github.com/apache/pulsar/pull/3556
 
 
   ### Motivation
   
   Sometimes while shutting down the broker, unknown uncompleted callback tries to update
`managed-cursor` even if managed-cursor is closed and because of that topic which is already
loaded by new broker sees below exception while updating managed-cursor-metadata and cursor
is not able to update mark-delete position until it unloaded again.
   ```
   09:22:02.641 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper
- [sample/global/ns1/persistent/TopicBadVersion] [pulsar.repl.us-east] Updating cursor info
ledgerId=14458739936 mark-delete=12345678900:5306
   00:00:05.642 [bookkeeper-ml-workers-OrderedExecutor-14-0] WARN  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl
- [sample/global/ns1/persistent/TopicBadVersion] Failed to update consumer pulsar.repl.us-east
   org.apache.bookkeeper.mledger.ManagedLedgerException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException:
KeeperErrorCode = BadVersion
   Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
BadVersion
           at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) ~[pulsar-broker-2.2.5-.jar:2.2.5-]
           at org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper.lambda$null$103(MetaStoreImplZookeeper.java:287)
~[managed-ledger-original-2.2.5-.jar:2.2.5-]
           at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [managed-ledger-original-2.2.5-.jar:2.2.5-]
           at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [bookkeeper-common-4.7.2.jar:4.7.2]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_131]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_131]
           at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[netty-all-4.1.22.Final.jar:4.1.22.Final]
   ```
   
   It seems it always happens for replicator-cursor and not sure exact step to reproduce but
it might happen when bookies might be in unstable state (not sure about the root cause). one
of the example:
   
   Behavior:
   broker-1 unloads the topic by `09:21:59.451` and broker-2 loads the same topic at `09:22:00.135`.
But broker-1 has log of updating metadata-znode at `09:22:05.207`. So, broker-2 fails to update
cursor-metadata with exception `BadVersionException`
   ```
   Previous-broker: broker-1
   09:21:58.855 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.namespace.OwnedBundle
- Disabling ownership: sample/global/ns1/0x2aaaaaa8_0x35555552
   :
   :
   09:21:59.668 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.namespace.OwnedBundle
- Unloading sample/global/ns1/0x2aaaaaa8_0x35555552 namespace-bundle with 345 topics completed
in 812.0 ms
   :
   09:21:58.930 [shutdown-thread-49-1] INFO  org.apache.pulsar.broker.service.BrokerService
- [persistent://sample/global/ns1/TopicBadVersion] Unloading topic
   :
   09:21:59.451 [bookkeeper-ml-workers-OrderedExecutor-14-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl
- [sample/global/ns1/persistent/TopicBadVersion][pulsar.repl.us-east] Closed cursor at md-position=12345678900:5306
   09:21:59.451 [bookkeeper-ml-workers-OrderedExecutor-14-0] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic
- [persistent://sample/global/ns1/TopicBadVersion] Topic closed
   :
   09:22:05.207 [BookKeeperClientWorker-OrderedExecutor-2-0] WARN  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl
- [sample/global/ns1/persistent/TopicBadVersion] Error updating cursor pulsar.repl.us-east
position 12345678900:5306 in meta-ledger 14451380450: BookKeeper client is closed
   09:22:05.207 [BookKeeperClientWorker-OrderedExecutor-2-0] INFO  org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper
- [sample/global/ns1/persistent/TopicBadVersion] [pulsar.repl.us-east] Updating cursor info
ledgerId=-1 mark-delete=12345678900:5306
   ```
   
   ```
   Current-broker: broker-2
   09:22:00.135 [pulsar-ordered-OrderedExecutor-7-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl
- Opening managed ledger sample/global/ns1/persistent/TopicBadVersion
   ```
   
   ### Modification
   
   It happens when someone tries to update cursor metadata by calling `persistPositionMetaStore(..)`
so, it should have safe guard and should not update metadata if cursor is already closed.
   
   ### Result
   It avoids any unexpected behavior while updating cursor-metadata in closed cursor state.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message