cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-1432) java.util.NoSuchElementException when returning a node to the cluster
Date Thu, 26 Aug 2010 21:40:54 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Ellis updated CASSANDRA-1432:
--------------------------------------

             Assignee: Jonathan Ellis
        Fix Version/s: 0.6.6
    Affects Version/s: 0.6.5
             Priority: Minor  (was: Major)
             Reviewer: brandon.williams

the HH paging code doesn't handle zero columns found for the row, which could happen if cleanup
removes the rows before they are handed off or if they are tombstoned and aged out during
compaction.

patches attached for 0.6 and trunk

> java.util.NoSuchElementException when returning a node to the cluster
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-1432
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1432
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.5, 0.7 beta 1
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.6.6, 0.7 beta 2
>
>         Attachments: 1432-06.txt, 1432-trunk.txt
>
>
> I'm running the v0.7-beta1 in a 4 nodes cluster and just doing some simple testing. One
of the nodes had been down (machine off, unclean shutdown) for an hour or so not sure how
many writes were going on, when I bought it back up this message appears in the other 3 nodes...
> INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,199 Gossiper.java (line 584) Node /192.168.34.27
has restarted, now UP again
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,200 HintedHandOffManager.java (line
191) Started hinted handoff for endpoint /192.168.34.27
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,201 StorageService.java (line 636) Node /192.168.34.27
state jump to normal
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,201 StorageService.java (line 643) Will not
change my token ownership to /192.168.34.27
> ERROR [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,640 CassandraDaemon.java (line 82)
Uncaught exception in thread Thread[HINTED-HANDOFF-POOL:1,5,main]
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.NoSuchElementException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:87)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.RuntimeException: java.util.NoSuchElementException
>         at orgapache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         ... 2 more
> Caused by: java.util.NoSuchElementException
>         at java.util.concurrent.ConcurrentSkipListMap.lastKey(ConcurrentSkipListMap.java:1981)
>         at java.util.concurrent.ConcurrentSkipListMap$KeySet.last(ConcurrentSkipListMap.java:2331)
>         at org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:121)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:218)
>         at org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:78)
>         at org.apache.cassandra.db.HintedHandOffManager$1.runMayThrow(HintedHandOffManager.java:296)
not sure how many writes were going on
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 6 more
> On the machine that was off (34.27) there are no errors in the logs, and here are the
entries for around the same time...
>  INFO [main] 2010-08-25 19:29:50,679 CommitLog.java (line 340) Recovery complete
>  INFO [main] 2010-08-25 19:29:50,769 CommitLog.java (line 180) Log replay complete
>  INFO [main] 2010-08-25 19:29:50,797 StorageService.java (line 342) Cassandra version:
0.7.0-beta1-SNAPSHOT
>  INFO [main] 2010-08-25 19:29:50,797 StorageService.java (line 343) Thrift API version:
10.0.0
>  INFO [main] 2010-08-25 19:29:50,813 SystemTable.java (line 240) Saved Token found: 85070591730234615865843651857942052864
>  INFO [main] 2010-08-25 19:29:50,813 SystemTable.java (line 257) Saved ClusterName found:
FOO
>  INFO [main] 2010-08-25 19:29:50,813 SystemTable.java (line 272) Saved partitioner not
found. Using org.apache.cassandra.dht.RandomPartitioner
>  INFO [main] 2010-08-25 19:29:50,814 ColumnFamilyStore.java (line 422) switching in a
fresh Memtable for LocationInfo at CommitLogContext(file='/local1/junkbox/cassandra/commitlog/CommitLog-12827213897
> 70.log', position=41336)
>  INFO [main] 2010-08-25 19:29:50,814 ColumnFamilyStore.java (line 706) Enqueuing flush
of Memtable-LocationInfo@916236367(95 bytes, 2 operations)
>  INFO [FLUSH-WRITER-POOL:1] 2010-08-25 19:29:50,815 Memtable.java (line 150) Writing
Memtable-LocationInfo@916236367(95 bytes, 2 operations)
>  INFO [FLUSH-WRITER-POOL:1] 2010-08-25 19:29:50,873 Memtable.java (line 157) Completed
flushing /local1/junkbox/cassandra/data/system/LocationInfo-e-6-Data.db
>  INFO [main] 2010-08-25 19:29:50,917 StorageService.java (line 374) Starting up server
gossip
>  INFO [main] 2010-08-25 19:29:51,093 ColumnFamilyStore.java (line 1239) Loaded 0 rows
into the Super2 cache
>  INFO [main] 2010-08-25 19:29:51,170 CassandraDaemon.java (line 153) Binding thrift service
to /0.0.0.0:9160
>  INFO [main] 2010-08-25 19:29:51,174 CassandraDaemon.java (line 167) Using TFramedTransport
with a max frame size of 15728640 bytes.
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,198 Gossiper.java (line 578) Node /192.168.34.28
is now part of the cluster
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,199 Gossiper.java (line 578) Node /192.168.34.29
is now part of the cluster
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,199 Gossiper.java (line 578) Node /192.168.34.26
is now part of the cluster
>  INFO [main] 2010-08-25 19:29:51,204 CassandraDaemon.java (line 208) Listening for thrift
clients...
>  INFO [main] 2010-08-25 19:29:51,210 Mx4jTool.java (line 73) Will not load MX4J, mx4j-tools.jar
is not in the classpath
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,417 HintedHandOffManager.java (line
191) Started hinted handoff for endpoint /192.168.34.28
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,417 Gossiper.java (line 570) InetAddress /192.168.34.28
is now UP
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,418 HintedHandOffManager.java (line
247) Finished hinted handoff of 0 rows to endpoint /192.168.34.28
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,855 HintedHandOffManager.java (line
191) Started hinted handoff for endpoint /192.168.34.29
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:51,855 Gossiper.java (line 570) InetAddress /192.168.34.29
is now UP
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:51,860 HintedHandOffManager.java (line
247) Finished hinted handoff of 0 rows to endpoint /192.168.34.29
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:52,930 HintedHandOffManager.java (line
191) Started hinted handoff for endpoint /192.168.34.26
>  INFO [GOSSIP_STAGE:1] 2010-08-25 19:29:52,930 Gossiper.java (line 570) InetAddress /192.168.34.26
is now UP
>  INFO [HINTED-HANDOFF-POOL:1] 2010-08-25 19:29:52,930 HintedHandOffManager.java (line
247) Finished hinted handoff of 0 rows to endpoint /192.168.34.26
> I ran a repair on all the nodes and this was all that they each logged 
>  INFO [manual-repair-fe7c5abb-bb0a-4415-aa75-0d72ba4e7f1b] 2010-08-25 19:49:24,194 AntiEntropyService.java
(line 803) Waiting for repair requests to: []
> The cluster seemed OK and kept on working.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message