cassandra-commits mailing list archives

From "Constance Eustace (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-6107) CQL3 Batch statement memory leak
Date Fri, 27 Sep 2013 15:20:03 GMT
Constance Eustace created CASSANDRA-6107:
--------------------------------------------

             Summary: CQL3 Batch statement memory leak
                 Key: CASSANDRA-6107
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6107
             Project: Cassandra
          Issue Type: Bug
          Components: API, Core
         Environment: - CASS version: 1.2.8 or 2.0.1, same issue seen in both
- Running on OSX MacbookPro
- Sun JVM 1.7
- Single local cassandra node
- both CMS and G1 GC used
- we are using the cass-JDBC driver to submit our batches


            Reporter: Constance Eustace
            Priority: Critical


We are running large-volume insert/update tests against a Cassandra node via CQL3.


Using a 4GB heap, after roughly 750,000 updates that create/update 75,000 row keys, we run
out of heap. The memory is never reclaimed, and we begin getting this infamous error, which
many people seem to be encountering:

WARN [ScheduledTasks:1] 2013-09-26 16:17:10,752 GCInspector.java (line 142) Heap is 0.9383457210434385
full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to
the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold
in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-09-26 16:17:10,753 StorageService.java (line 3614) Unable to
reduce heap usage since there are no dirty column families


8GB and 12GB heaps appear only to delay the problem, by a roughly proportionate amount: about
75,000 - 100,000 row keys per 4GB of heap. Each run of 50,000 row key creations sees the heap
grow and never shrink again.
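
The proportionality described above can be written out as a small calculation. This is only a sketch of the arithmetic: the function name `estimated_row_keys` is hypothetical, and the linear scaling is an extrapolation from the reported 75,000 - 100,000 row keys per 4GB, not a measured result.

```python
from typing import Tuple


def estimated_row_keys(heap_gb: float,
                       low_per_4gb: int = 75_000,
                       high_per_4gb: int = 100_000) -> Tuple[int, int]:
    """Scale the reported per-4GB range linearly to another heap size.

    Assumption: heap exhaustion scales linearly with the number of row keys
    written, as the report's "roughly proportionate" wording suggests.
    """
    scale = heap_gb / 4
    return int(low_per_4gb * scale), int(high_per_4gb * scale)


if __name__ == "__main__":
    for heap in (4, 8, 12):
        low, high = estimated_row_keys(heap)
        print(f"{heap}GB heap: roughly {low:,} - {high:,} row keys before exhaustion")
```

If the estimate holds, a larger heap only moves the failure point further out; it does not remove it.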

We have attempted the following, to no effect:
- removing all secondary indexes to see if that alleviates overuse of bloom filters
- adjusting parameters for compaction throughput
- adjusting memtable flush thresholds and other parameters

From examining heap dumps, it seems apparent that the problem is perpetual retention of CQL3
statements. We have even tried dropping the keyspaces after the updates, and the CQL3
statements are still visible in the heap dump, even after a large number of CMS GC runs. G1
also showed this issue.

The 750,000 statements are broken into batches of roughly 200 statements.
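
For reference, the batching described above can be sketched as follows. This is an illustration of the workload shape, not the reporter's actual test code: the keyspace, table, and column names (`test_ks.entity`, `rowkey`, `val0`...) are hypothetical, and the helper only builds the CQL3 batch text rather than submitting it through a driver.

```python
from typing import Iterator, List

BATCH_SIZE = 200  # "roughly 200 statements" per batch, per the report


def chunk(statements: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` statements."""
    for start in range(0, len(statements), size):
        yield statements[start:start + size]


def to_cql3_batch(statements: List[str]) -> str:
    """Wrap individual CQL3 statements in one BEGIN BATCH ... APPLY BATCH block."""
    body = "\n".join(f"  {s.rstrip(';')};" for s in statements)
    return f"BEGIN BATCH\n{body}\nAPPLY BATCH;"


def make_updates(row_keys: int, updates_per_key: int) -> List[str]:
    """Generate update statements against a hypothetical table."""
    return [
        f"UPDATE test_ks.entity SET val{u} = 'v' WHERE rowkey = 'key{k}'"
        for k in range(row_keys)
        for u in range(updates_per_key)
    ]


if __name__ == "__main__":
    # Scaled-down run: 100 row keys x 10 updates each = 1,000 statements.
    stmts = make_updates(row_keys=100, updates_per_key=10)
    batches = [to_cql3_batch(c) for c in chunk(stmts)]
    print(len(stmts), "statements in", len(batches), "batches")
    # At the reported scale, 750,000 statements / 200 per batch:
    print(750_000 // BATCH_SIZE, "batches")
```

At the reported scale this works out to about 3,750 batches, and 750,000 updates over 75,000 row keys is an average of 10 updates per key.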

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
