cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yang Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite
Date Wed, 24 Aug 2011 00:47:29 GMT
liveSize() calculation is wrong in case of overwrite
----------------------------------------------------

                 Key: CASSANDRA-3073
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
             Project: Cassandra
          Issue Type: Bug
            Reporter: Yang Yang
            Priority: Minor


currently liveSize() is the sum of currentThroughput.

this definition is wrong if most of the operations are overwrite, or counter (which is essentially
overwrite).

for example, the following code should always keep a single entry in db, with one row, one
cf, one column, and supposedly should have a size of only about 100 bytes.
connect localhost/9160;  
create keyspace blah;
use blah;

create column family cf2 with memtable_throughput=1024 and memtable_operations=10000  ;


set the cassandra.yaml 
memtable_total_space_in_mb: 20

to make the error appear faster (but if u set to default, still same issue will appear)

then we use a simple pycassa  script:

>>> pool = pycassa.connect('blah')
>>> mycf = pycassa.ColumnFamily(pool,"cf2");
>>> for x in range(1,10000000) :
...     xx = mycf.insert('key1',{'col1':"{}".format(x)})
... 



you will see sstables being generated with only sizes of a few k, though we set the CF options
to get high SSTable sizes





--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message