cassandra-user mailing list archives

From Mark Jones <>
Subject RE: How to increase cassandra's performance in read?
Date Tue, 20 Apr 2010 14:50:00 GMT
I too am seeing very slow performance while testing a worst-case scenario: one key leading to
one supercolumn, with a single column beneath it.

Key -> SuperColumn -> 1 Column (of ~ 500 bytes)

Drive utilization is 80-90%, and I'm only dealing with 50-70 million rows (with no swapping).
So far I've found nothing that helps, including increasing the key cache from 200k to 500k keys.
I'm guessing the hashing prevents better cache performance.
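
A rough way to see why the larger key cache may not help: if reads are spread evenly over all rows (an assumption; real workloads are often skewed), the expected hit ratio is about the cache size divided by the row count. The sketch below only does that arithmetic with the figures from this message; `expected_hit_ratio` is a hypothetical helper, not a Cassandra API.

```python
def expected_hit_ratio(cache_keys: int, total_rows: int) -> float:
    """Expected key-cache hit ratio, assuming every row is equally likely to be read."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    # A cache can never hit more than 100% of the time.
    return min(1.0, cache_keys / total_rows)


# Cache sizes and row counts quoted in the message above.
for cache_keys in (200_000, 500_000):
    for total_rows in (50_000_000, 70_000_000):
        ratio = expected_hit_ratio(cache_keys, total_rows)
        print(f"{cache_keys:>7} cached keys / {total_rows:>10} rows -> {ratio:.2%}")
```

Under that uniform-access assumption, even the 500k-key cache covers at most about 1% of 50 million rows, so nearly every lookup would still go to disk.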

Read performance is definitely not 3 IOs per read, based on the utilization factors on my drives.
I'm not sure the previous e-mails ever settled how to calculate how many IOs are being done
for each read. I've been testing with clusters of 1, 2, 3, or 4 machines, and so far all I'm
seeing with multiple machines is lower performance in a cluster than on a single machine.
I keep assuming that at some number of nodes the performance will begin to pick up. Three
of my nodes are running with 8GB (6GB Java heap), and one has 4GB (3GB Java heap). The machine
with the smallest memory footprint is the fastest performer on inserts, but definitely not
the fastest on reads.
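
One rough way to estimate IOs per read, offered as a sketch rather than a settled method: divide the disk read operations per second observed on a node (for example the `r/s` column of `iostat -x`) by the client reads per second that node completed over the same interval. The sample numbers below are hypothetical placeholders, not measurements from this cluster, and the estimate assumes reads are the only significant source of disk reads (no compaction running).

```python
def ios_per_read(disk_reads_per_sec: float, client_reads_per_sec: float) -> float:
    """Average disk read operations issued per client read over one sampling interval."""
    if client_reads_per_sec <= 0:
        raise ValueError("client_reads_per_sec must be positive")
    return disk_reads_per_sec / client_reads_per_sec


# Hypothetical sample: 450 disk reads/s observed while serving 90 client reads/s.
print(f"{ios_per_read(450.0, 90.0):.1f} disk IOs per read")
```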

I suspect the read path relies heavily on the assumption that you want to fetch many closely
related columns at once, because lookup by key appears to be incredibly slow.

From: yangfeng []
Sent: Tuesday, April 20, 2010 7:59 AM
Subject: How to increase cassandra's performance in read?

I get 10 column families by key, and one column family has 30 columns.
I use multigetSlice once to get the 10 column families, but the performance is very poor.
Does anyone have other ideas for improving the performance?
