cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefan Podkowinski (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13020) Stuck in LEAVING state (Transferring all hints to null)
Date Fri, 10 Mar 2017 13:51:04 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stefan Podkowinski updated CASSANDRA-13020:
-------------------------------------------
    Assignee: Stefan Podkowinski
      Status: Patch Available  (was: Open)

||3.0||3.11||trunk||
|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.0]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.11]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-trunk]|
|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-dtest/]|
|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-testall/]|

Anyone up for review? 

Assumptions:
tokenMetadata will only contain public broadcast addresses as keys, so we must not use the
internal IP for retrieving the nodeID
Hints will only be streamed to public addresses in the end anyways

> Stuck in LEAVING state (Transferring all hints to null)
> -------------------------------------------------------
>
>                 Key: CASSANDRA-13020
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13020
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: v3.0.9
>            Reporter: Aleksandr Ivanov
>            Assignee: Stefan Podkowinski
>              Labels: decommission, hints
>
> I tried to decommission one node.
> Node sent all data to another node and got stuck in LEAVING state.
> Log message shows Exception in HintsDispatcher thread.
> Could it be reason of stuck in LEAVING state?
> command output:
> {noformat}
> root@cas-node6:~# time nodetool decommission
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>         at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
>         at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>         at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>         at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
>         at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> real    147m7.483s
> user    0m17.388s
> sys     0m1.968s
> {noformat}
> nodetool netstats:
> {noformat}
> root@cas-node6:~# nodetool netstats
> Mode: LEAVING
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 35082
> Mismatch (Blocking): 18
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed   Dropped
> Large messages                  n/a         1              0         0
> Small messages                  n/a         0       16109860       112
> Gossip messages                 n/a         0         287074         0
> {noformat}
> Log:
> {noformat}
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:52:59,467 StorageService.java:1170
- LEAVING: sleeping 30000 ms for batch processing and pending range setup
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,455 StorageService.java:1170
- LEAVING: replaying batch log and streaming data to other nodes
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,910 StreamResultFuture.java:87
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Executing streaming plan for Unbootstrap
> INFO  [StreamConnectionEstablisher:1] 2016-12-07 12:53:39,911 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.17
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,911 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,912 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,912 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,912 StorageService.java:1170
- LEAVING: streaming hints to other nodes
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,912 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.10 is complete
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,912 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.11 is complete
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,912 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.12 is complete
> INFO  [StreamConnectionEstablisher:5] 2016-12-07 12:53:39,912 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.13
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,912 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:8] 2016-12-07 12:53:39,912 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.14
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,913 HintsService.java:218
- Resumed hints dispatch
> INFO  [StreamConnectionEstablisher:7] 2016-12-07 12:53:39,913 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.15
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,914 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.16 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,914 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.10
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,914 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.11
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,914 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.12
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,917 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.16
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,917 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,917 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,917 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.18
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,917 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.19
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,917 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.20 is complete
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,917 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.21 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,920 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.20
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,920 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.21
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,920 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,920 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.22
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.23 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.23
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.24 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.24
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 StreamResultFuture.java:183
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.25 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.25
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 StreamSession.java:239
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.26 through
/192.168.101.70
> INFO  [HintsDispatcher:1] 2016-12-07 12:53:39,926 HintsDispatchExecutor.java:140 - Transferring
all hints to null
> ERROR [HintsDispatcher:1] 2016-12-07 12:53:39,928 CassandraDaemon.java:205 - Exception
in thread Thread[HintsDispatcher:1,1,main]
> java.lang.NullPointerException: null
>         at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
~[na:1.8.0_92]
>         at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
~[na:1.8.0_92]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
~[apache-cassandra-3.0.9.jar:3.0.9]
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) ~[na:1.8.0_92]
>         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
~[na:1.8.0_92]
>         at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
~[na:1.8.0_92]
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_92]
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
~[na:1.8.0_92]
>         at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
~[na:1.8.0_92]
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
~[na:1.8.0_92]
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_92]
>         at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[na:1.8.0_92]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
~[apache-cassandra-3.0.9.jar:3.0.9]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
~[apache-cassandra-3.0.9.jar:3.0.9]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_92]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_92]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_92]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_92]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,093 StreamResultFuture.java:169
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb ID#0] Prepare completed. Receiving 0 files(0
bytes), sending 2 files(
> 133006 bytes)
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,097 StreamCoordinator.java:213
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, ID#0] Beginning stream session with /10.10.10.26
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,097 StreamSession.java:232
- [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> ...
> INFO  [IndexSummaryManager:1] 2016-12-07 14:51:44,761 IndexSummaryRedistribution.java:74
- Redistributing index summaries
> INFO  [STREAM-IN-/10.10.10.27] 2016-12-07 15:20:06,006 StreamResultFuture.java:183 -
[Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session with /10.10.10.27 is complete
> INFO  [STREAM-IN-/10.10.10.27] 2016-12-07 15:20:06,007 StreamResultFuture.java:215 -
[Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] All sessions completed
> {noformat}
> all real IPs replaced with IPs from 10.10.10.x network in log



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message