incubator-cassandra-user mailing list archives

From Mikhail Mazursky <ash...@gmail.com>
Subject Re: Node hang on shutdown
Date Thu, 31 Oct 2013 08:55:12 GMT
Romain,

Yes, I had to use kill -9 to stop it.

INFO [RequestResponseStage:54] 2013-10-31 11:59:10,413 Gossiper.java (line
789) InetAddress /192.168.0.197 is now UP
 INFO [GossipStage:1] 2013-10-31 11:59:10,706 StorageService.java (line
1298) Node /192.168.0.197 state jump to normal
 INFO [GossipStage:1] 2013-10-31 12:00:55,905 Gossiper.java (line 806)
InetAddress /192.168.0.251 is now DOWN
 INFO [HANDSHAKE-/192.168.0.251] 2013-10-31 12:01:35,848
OutboundTcpConnection.java (line 386) Handshaking version with /192.168.0.251
 INFO [GossipStage:1] 2013-10-31 12:01:35,978 Gossiper.java (line 824) Node
/192.168.0.251 has restarted, now UP
 INFO [MemoryMeter:1] 2013-10-31 12:01:35,980 Memtable.java (line 442)
CFS(Keyspace='system', ColumnFamily='peers') liveRatio is
17.361702127659573 (just-counted was 17.361702127659573).  calculation took
0ms for 10 columns
 INFO [HANDSHAKE-/192.168.0.251] 2013-10-31 12:01:35,989
OutboundTcpConnection.java (line 386) Handshaking version with /192.168.0.251
 INFO [RequestResponseStage:55] 2013-10-31 12:01:36,065 Gossiper.java (line
789) InetAddress /192.168.0.251 is now UP
ERROR [MigrationStage:2] 2013-10-31 12:01:36,176 CassandraDaemon.java (line
185) Exception in thread Thread[MigrationStage:2,5,main]
java.lang.RuntimeException: java.io.FileNotFoundException:
/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-95-Index.db
(Too many open files)
        at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:102)
        at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:90)
        at
org.apache.cassandra.io.sstable.SSTableReader.openIndexReader(SSTableReader.java:1337)
        at
org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:68)
        at
org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1115)
        at
org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:69)
        at
org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1507)
        at
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1626)
        at
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1564)
        at
org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:722)
        at
org.apache.cassandra.db.SystemKeyspace.serializeSchema(SystemKeyspace.java:743)
        at
org.apache.cassandra.db.SystemKeyspace.serializeSchema(SystemKeyspace.java:733)
        at
org.apache.cassandra.db.MigrationRequestVerbHandler.doVerb(MigrationRequestVerbHandler.java:42)
        at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException:
/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-95-Index.db
(Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
        at
org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
        at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:98)
        ... 16 more
 INFO [GossipStage:1] 2013-10-31 12:01:36,549 StorageService.java (line
1298) Node /192.168.0.251 state jump to normal
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,067
ThriftServer.java (line 141) Stop listening to thrift clients
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,174 Server.java
(line 174) Stop listening for CQL clients
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,175 Gossiper.java
(line 1129) Announcing shutdown
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:40,175
MessagingService.java (line 665) Waiting for messaging service to quiesce
 INFO [ACCEPT-/192.168.0.232] 2013-10-31 12:02:40,176 MessagingService.java
(line 846) MessagingService shutting down server thread.

There are many such exceptions in the log (the one above is the last before
the node hung).
No exceptions appear after the upgrade and restart.
It looks like C* is leaking file handles (nothing else is running on that VM).
I will try to debug it.
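
One way to check for a file handle leak on Linux is to watch how many
descriptors the Cassandra process holds and which files they point at. The
sketch below is only an illustration, not part of Cassandra: the function
names `open_fd_targets` and `summarize` are made up for this example, it
assumes a Linux `/proc` filesystem, and it has to run as the same user as the
Cassandra process (or as root) to be allowed to read the descriptor links.

```python
import os
from collections import Counter


def open_fd_targets(pid):
    """Return the link targets of all open file descriptors of a process.

    Assumption: Linux, where /proc/<pid>/fd holds one symlink per descriptor.
    """
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def summarize(targets, top=10):
    """Count descriptors per target, most frequent first."""
    return Counter(targets).most_common(top)
```

Calling `summarize(open_fd_targets(pid))` with the PID of the Cassandra JVM a
few minutes apart should show whether the total keeps growing and whether the
same SSTable files are opened again and again without being closed.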

Thanks for pointing me to the logs.

Regards,
Mikhail.



2013/10/31 Romain HARDOUIN <romain.hardouin@urssaf.fr>

> Hi,
>
> So you had to kill -9 the process?
> Is there something interesting in system.log?
> Can you restart the node or are there any errors on startup?
>
> Romain
>
> Mikhail Mazursky <ash2kk@gmail.com> wrote on 31/10/2013 08:02:22:
>
> > From: Mikhail Mazursky <ash2kk@gmail.com>
> > To: user@cassandra.apache.org,
> > Date: 31/10/2013 08:04
> > Subject: Node hang on shutdown
> >
> > Hi.
> >
> > I was upgrading my 3-node testing cluster from 2.0.1 to 2.0.2. I
> > successfully upgraded two nodes, but the last one did not shut down
> > properly. Does anybody see anything suspicious in the attached thread
> > dump?
>
> > Regards,
> > Mikhail.
>
