cassandra-commits mailing list archives

From "Vladimir (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-9120) OutOfMemoryError when read auto-saved cache (probably broken)
Date Sat, 04 Apr 2015 07:50:33 GMT
Vladimir created CASSANDRA-9120:

             Summary: OutOfMemoryError when read auto-saved cache (probably broken)
                 Key: CASSANDRA-9120
             Project: Cassandra
          Issue Type: Bug
         Environment: Linux
            Reporter: Vladimir
             Fix For: 2.0.14

Found during tests on a 100-node cluster. After a restart I found that one node constantly
crashes with an OutOfMemoryError. I guess that the auto-saved cache was corrupted and Cassandra
can't recognize it. I see that similar issues were already fixed (where a negative size of some
structure was read). Does the auto-saved cache have a checksum? It would help to reject a
corrupted cache at the very beginning.
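
A checksum along the lines suggested above could look like this minimal sketch. It is a hypothetical scheme (the class, method names, and CRC32-over-payload layout are assumptions, not Cassandra's actual saved-cache format), assuming the writer appends a CRC32 of the serialized payload that the reader verifies before deserializing:

```java
import java.util.zip.CRC32;

public class CacheChecksum {
    // Compute a CRC32 over the serialized cache payload (hypothetical scheme,
    // not the real saved-cache file layout).
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Reject the cache up front if the stored checksum does not match,
    // instead of feeding corrupted bytes to the deserializer.
    static boolean isValid(byte[] payload, long storedCrc) {
        return checksum(payload) == storedCrc;
    }

    public static void main(String[] args) {
        byte[] data = "key-cache-entries".getBytes();
        long crc = checksum(data);
        System.out.println(isValid(data, crc)); // intact payload
        data[0] ^= 0x1;                         // simulate one flipped bit
        System.out.println(isValid(data, crc)); // corrupted payload
    }
}
```

With such a check, a corrupted cache file is detected and discarded at startup rather than being partially deserialized.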

As far as I can see, the current code still has this problem. The stack trace is:

INFO [main] 2015-03-28 01:04:13,503 (line 114) reading saved cache /storage/core/loginsight/cidata/cassandra/saved_caches/system-sstable_activity-KeyCache-b.db
ERROR [main] 2015-03-28 01:04:14,718 (line 513) Exception encountered
during startup
java.lang.OutOfMemoryError: Java heap space
        at java.util.ArrayList.<init>(Unknown Source)
        at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(
        at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(
        at org.apache.cassandra.cache.AutoSavingCache.loadSaved(
        at org.apache.cassandra.db.ColumnFamilyStore.<init>(
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(
        at org.apache.cassandra.db.Keyspace.initCf(
        at org.apache.cassandra.db.Keyspace.<init>(
        at org.apache.cassandra.db.SystemKeyspace.checkHealth(
        at org.apache.cassandra.service.CassandraDaemon.setup(
        at org.apache.cassandra.service.CassandraDaemon.activate(
        at org.apache.cassandra.service.CassandraDaemon.main(

I looked at the Cassandra source code and see:

int entries = in.readInt();
List<IndexHelper.IndexInfo> columnsIndex = new ArrayList<IndexHelper.IndexInfo>(entries);

It seems that the value of entries read from the corrupted file is invalid (negative, or
absurdly large), and the attempt to allocate an ArrayList with that initial capacity hits the
OOM. I deleted the saved_caches directory and was able to start the node correctly. We should
expect this to happen in the real world: Cassandra should be able to skip incorrect cached
data and keep running.
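
One defensive fix along these lines can be sketched as follows. This is an illustration, not the actual patch: the class name, the `MAX_ENTRIES` cap, and the simplified element type are assumptions, but the bound check before the `ArrayList` allocation is the point:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class SafeIndexRead {
    // Hypothetical cap on a plausible index size; a count outside this range
    // is treated as corruption instead of being handed to ArrayList.
    static final int MAX_ENTRIES = 1 << 20;

    static List<Long> readIndex(DataInput in) throws IOException {
        int entries = in.readInt();
        if (entries < 0 || entries > MAX_ENTRIES)
            throw new IOException("Saved cache looks corrupted: entry count " + entries);
        List<Long> columnsIndex = new ArrayList<>(entries); // safe: bounded capacity
        for (int i = 0; i < entries; i++)
            columnsIndex.add(in.readLong());
        return columnsIndex;
    }

    public static void main(String[] args) throws IOException {
        // A well-formed stream: count = 2, then two longs.
        ByteBuffer good = ByteBuffer.allocate(20).putInt(2).putLong(7).putLong(9);
        System.out.println(readIndex(new DataInputStream(
                new ByteArrayInputStream(good.array()))));

        // A corrupted stream: a garbage count like the one behind the OOM.
        ByteBuffer bad = ByteBuffer.allocate(4).putInt(Integer.MAX_VALUE);
        try {
            readIndex(new DataInputStream(new ByteArrayInputStream(bad.array())));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The caller can then catch the IOException, log a warning, and fall back to a cold cache rather than crashing the node.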

This message was sent by Atlassian JIRA
