incubator-cassandra-user mailing list archives

From "Xu Zhongxing" <xu_zhong_x...@163.com>
Subject Re: CQLSSTableWriter memory leak
Date Fri, 06 Jun 2014 07:50:12 GMT
We figured out the reason for the growing memory usage. When adding rows, the flush-to-disk
operation is only performed in SSTableSimpleUnsortedWriter.newRow(). But in the compound
primary key case, when the partition key stays identical, no new row is ever started. So a
single huge row is kept in memory and no disk sync() is ever done.
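The behavior described above can be illustrated with a small, self-contained model. This is a hypothetical simplification written for this thread, not Cassandra's actual code: the class name BufferModel and its methods are invented, but the control flow mirrors the described pattern, where the buffer-size check happens only when a new row (partition) begins.

```java
// Hypothetical model of the buffering pattern described above: the size
// check and flush happen only at partition boundaries (newRow), so a
// single partition can grow without ever triggering a flush.
class BufferModel {
    private final long bufferLimit;   // analogous to withBufferSizeInMB
    private long bufferedBytes = 0;
    private Object currentKey = null;
    private int flushes = 0;

    BufferModel(long bufferLimit) {
        this.bufferLimit = bufferLimit;
    }

    // Called for every logical row; dispatches to newRow only when the
    // partition key changes, mirroring the addRow -> newRow flow.
    void addRow(Object partitionKey, long rowSize) {
        if (!partitionKey.equals(currentKey)) {
            newRow(partitionKey);     // size check only happens here
        }
        bufferedBytes += rowSize;     // grows unchecked within a partition
    }

    private void newRow(Object key) {
        if (bufferedBytes >= bufferLimit) {
            bufferedBytes = 0;        // simulate sync-to-disk
            flushes++;
        }
        currentKey = key;
    }

    long buffered()   { return bufferedBytes; }
    int flushCount()  { return flushes; }
}
```

With a constant partition key, flushCount() stays at 0 and buffered() grows past the limit indefinitely; with varying partition keys, the buffer is flushed regularly. That matches the observation that only compound-key tables written with a single partition key leak.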
On 2014-06-06 00:16:13, "Jack Krupansky" <jack@basetechnology.com> wrote:

How many rows (primary key values) are you writing for each partition of the primary key?
I mean, are there relatively few, or are these very wide partitions?
 
Oh, I see! You’re writing 50,000,000 rows to a single partition! My, that IS ambitious.
 
-- Jack Krupansky
 
From: Xu Zhongxing
Sent: Thursday, June 5, 2014 3:34 AM
To: user@cassandra.apache.org
Subject: CQLSSTableWriter memory leak
 

I am using Cassandra's CQLSSTableWriter to import a large amount of data into Cassandra. When
I use CQLSSTableWriter to write to a table with a compound primary key, the memory consumption
keeps growing and the JVM GC cannot reclaim any of the used memory. When writing to tables
without a compound primary key, the JVM GC works fine.

My Cassandra version is 2.0.5. The OS is Ubuntu 14.04 x86-64. JVM parameters are -Xms1g -Xmx2g,
which is sufficient in all the non-compound primary key cases.

The problem can be reproduced by the following test case:

import org.apache.cassandra.io.sstable.CQLSSTableWriter;

import java.util.UUID;

class SS {
    public static void main(String[] args) {
        String schema = "create table test.t (x uuid, y uuid, primary key (x, y))";
        String insert = "insert into test.t (x, y) values (?, ?)";
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
            .inDirectory("/tmp/test/t")
            .forTable(schema)
            .withBufferSizeInMB(32)
            .using(insert)
            .build();

        // A single partition key shared by all 50,000,000 rows.
        UUID id = UUID.randomUUID();
        try {
            for (int i = 0; i < 50000000; i++) {
                UUID id2 = UUID.randomUUID();   // unique clustering key per row
                writer.addRow(id, id2);
            }
            writer.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}