cassandra-user mailing list archives

From Keith Freeman
Subject insert performance (1.2.8)
Date Mon, 19 Aug 2013 22:49:44 GMT
I've got a 3-node Cassandra cluster (16G/4-core VMs on ESXi v5, on 2.5GHz 
machines not shared with any other VMs).  I'm inserting time-series data 
into a single column-family using "wide rows" (timeuuids) and have a 
3-part partition key so my primary key is something like ((a, b, day), 
in-time-uuid), x, y, z).
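
For reference, the layout described above could be sketched roughly as follows. This is only an illustration under several assumptions: the table name `events`, all column types, and the identifier `in_time_uuid` (unquoted CQL identifiers cannot contain hyphens) are hypothetical, and because the parenthesization of the key above is ambiguous, x, y and z are treated here as regular columns. The `dayBucket` helper is likewise a hypothetical way of deriving the `day` component, in UTC.

```java
import java.time.Instant;
import java.time.ZoneOffset;

public class WideRowSchemaSketch {
    // Hypothetical CQL 3 table: (a, b, day) is the composite partition key and
    // in_time_uuid is the clustering column. Names and types are assumptions.
    static final String CREATE_TABLE =
        "CREATE TABLE events ("
        + " a text, b text, day text,"
        + " in_time_uuid timeuuid,"
        + " x text, y text, z text,"
        + " PRIMARY KEY ((a, b, day), in_time_uuid)"
        + ")";

    // Derives the day bucket (UTC calendar date, ISO-8601) from epoch milliseconds.
    static String dayBucket(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneOffset.UTC)
                .toLocalDate()
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(CREATE_TABLE);
        // Timestamp of this message: Mon, 19 Aug 2013 22:49:44 GMT
        System.out.println(dayBucket(1376952584000L));
    }
}
```

With a key like this, every row sharing the same (a, b, day) lands in one partition, ordered by the timeuuid.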

My Java client is feeding rows (about 1k of raw data each) in 
batches using multiple threads, and the fastest I can get it to run 
reliably is about 2000 rows/second.  Even at that speed, all 3 Cassandra 
nodes are very CPU bound, with loads of 6-9 each (and the client machine 
is hardly breaking a sweat).  I've tried turning off compression in my 
table, which reduced the loads slightly but not much.  There are no other 
updates or reads occurring, except the DataStax OpsCenter.
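
The general shape of such a client (batches submitted from several threads) might look like the sketch below. It is not the actual client: `BatchWriter` is a hypothetical stand-in for whatever driver call writes one batch, and the batch size and thread count are arbitrary parameters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class BatchFeederSketch {
    // Hypothetical stand-in for the real driver call that writes one batch.
    interface BatchWriter {
        void write(List<byte[]> batch);
    }

    // Splits rows into batches and hands them to a fixed thread pool, then
    // waits up to one minute for the submitted batches to finish.
    static void feed(List<byte[]> rows, int batchSize, int threads, BatchWriter writer) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            List<byte[]> batch = new ArrayList<>(rows.subList(start, end));
            pool.submit(() -> writer.write(batch));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for batches", e);
        }
    }

    // Runs the feeder against an in-memory counter and returns the rows written.
    static long demoRowsWritten(int rowCount, int batchSize, int threads) {
        List<byte[]> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(new byte[1024]); // about 1k of raw data per row
        }
        AtomicLong written = new AtomicLong();
        feed(rows, batchSize, threads, batch -> written.addAndGet(batch.size()));
        return written.get();
    }

    public static void main(String[] args) {
        System.out.println(demoRowsWritten(2000, 50, 4));
    }
}
```

Note that the sketch ignores failures inside `write`; a real client would need to inspect the futures returned by `submit`.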

I was expecting to be able to insert at least 10k rows/second with this 
configuration, and after a lot of reading of docs, blogs, and Google, 
can't really figure out what's slowing my client down.  When I increase 
the insert speed of my client beyond 2000/second, the server responses 
are just too slow and the client falls behind.  I had a single-node 
MySQL database that could handle 10k of these data rows/second, so I 
really feel like I'm missing something in Cassandra.  Any ideas?
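
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It takes "about 1k" as 1024 bytes. The replication factor is not stated in this message, so it is left as a parameter, and an even spread of writes across the 3 nodes is assumed.

```java
public class ThroughputSketch {
    static final int ROW_BYTES = 1024; // "about 1k" per row, taken as 1024 bytes

    // Raw payload bytes the client sends per second at a given row rate.
    static long bytesPerSecond(long rowsPerSecond) {
        return rowsPerSecond * ROW_BYTES;
    }

    // Replica writes each node handles per second, assuming writes are spread
    // evenly. The replication factor is an assumption, not given in the post.
    static double replicaWritesPerNode(long rowsPerSecond, int replicationFactor, int nodes) {
        return (double) rowsPerSecond * replicationFactor / nodes;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerSecond(2000));   // current rate: 2,048,000 bytes/s
        System.out.println(bytesPerSecond(10_000)); // target rate: 10,240,000 bytes/s
        System.out.println(replicaWritesPerNode(2000, 3, 3));
    }
}
```

Either way the raw data rate is only a few MB/second, which is consistent with the client machine hardly breaking a sweat.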
