incubator-cassandra-user mailing list archives

From malcolm smith <malsm...@treehousesystems.com>
Subject Newbie Performance Question
Date Fri, 26 Mar 2010 15:45:04 GMT
I've been getting a feel for the performance characteristics of Cassandra
using version 0.5.1.  I've done similar tests on HBase before, but Cassandra
has some very appealing aspects that I would like to pursue.

However, I'm not seeing what seems to be the common level of performance
that others report.

Perf summary:

My test program inserts 100K 5-character strings, each with 10 bytes of value
data, into a single row / column family.  The column family is sorted by raw
bytes.

Single thread inserts - yields 277 inserts per second with ZERO consistency
level (or 3.61 milliseconds per insert)
Single thread inserts - yields 207 inserts per second with ONE consistency
level (or 4.83 milliseconds per insert)

With 5 threads (actually 5 processes running simultaneously, each inserting
to a different top-level key value):
Five thread inserts - yields 94 inserts per second with ONE consistency
level (or 11 milliseconds per insert)

I see people on this mailing list reporting 3,000 or more inserts per second,
so it seems like I'm off by an order of magnitude or more.

Also a similar test on HBase with a single thread gets me 3,333 inserts per
second on the same laptop machine.


Background:  I'm running standalone (single node) on a 2-core 64-bit Dell
laptop running Ubuntu 9.10 (kernel 2.6.31-20-generic) with 8GB RAM and a
240GB SSD (see Java setup below).

Sun Java 6 VM options:
-Xdebug -Xms512M -Xmx1G -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90
-XX:+AggressiveOpts
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=1 -XX:+CMSParallelRemarkEnabled
-XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8085
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dstorage-config=/etc/cassandra -Dcassandra-foreground=yes

I've used the default storage-conf.xml but changed these values:
<FlushDataBufferSizeInMB>320</FlushDataBufferSizeInMB>
<FlushIndexBufferSizeInMB>80</FlushIndexBufferSizeInMB>
<MemtableSizeInMB>128</MemtableSizeInMB>
<MemtableObjectCountInMillions>0.5</MemtableObjectCountInMillions>
<MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
<ConcurrentReads>8</ConcurrentReads>

This is the Perl code I'm using for the test.  Note that the timestamps and
the values are pre-calculated in another loop to isolate the Cassandra
calls from everything else.

$key = "testrow" . $procID;

# One insert per column; $i is the loop index used as the column name.
$client->insert(
    'Keyspace1',
    $key,
    Net::Cassandra::Backend::ColumnPath->new({
        column_family => 'Super1',
        super_column  => 'test-super7',
        column        => $i,
    }),
    $data{$i}->{'val'},
    $data{$i}->{'time'},
    Net::Cassandra::Backend::ConsistencyLevel::ONE
);
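
For anyone wanting to reproduce the measurement, the timing around that call can be sketched roughly as below. This is a minimal sketch and not the actual test program: insert_stub, run_benchmark, ms_per_insert and the key 'testrow0' are hypothetical names introduced for illustration, and the stub stands in for the real $client->insert(...) call so the script runs without a Cassandra node. It pre-calculates the column names, values and timestamps first, then times only the insert loop and converts the result to inserts per second and milliseconds per insert.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Convert a throughput figure into average latency per insert.
sub ms_per_insert {
    my ($inserts_per_second) = @_;
    return 1000 / $inserts_per_second;
}

# Hypothetical stand-in for the real $client->insert(...) call, so the
# harness runs without a Cassandra node. Swap in the real call to measure.
sub insert_stub {
    my ($key, $column, $value, $timestamp) = @_;
    return 1;
}

# Time $count calls to $insert and report throughput and latency.
sub run_benchmark {
    my ($count, $insert) = @_;

    # Pre-calculate everything outside the timed loop.
    my %data;
    for my $i (1 .. $count) {
        $data{$i} = {
            col  => sprintf('%05d', $i),         # 5-character column name
            val  => sprintf('%010d', $i),        # 10 bytes of value data
            time => int(time() * 1_000_000),     # microsecond timestamp
        };
    }

    my $start = time();
    for my $i (1 .. $count) {
        $insert->('testrow0', $data{$i}{col}, $data{$i}{val}, $data{$i}{time});
    }
    my $elapsed = time() - $start;
    $elapsed = 1e-9 if $elapsed <= 0;    # guard against a zero reading

    my $rate = $count / $elapsed;
    return {
        count      => $count,
        per_second => $rate,
        ms_each    => ms_per_insert($rate),
    };
}

my $result = run_benchmark(1000, \&insert_stub);
printf "%d inserts: %.0f per second, %.3f ms per insert\n",
    $result->{count}, $result->{per_second}, $result->{ms_each};
```

As a check on the figures quoted earlier, ms_per_insert(277) comes to roughly 3.61 and ms_per_insert(207) to roughly 4.83, which matches the per-insert times in the summary.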



Thanks in advance for your help.
