On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne <sylvain@yakaz.com> wrote:
I don't know if that could play any role, but if you have disabled
assertions when running Cassandra (that is, you removed the -ea line in
cassandra.in.sh), there was a bug in 0.6beta2 that makes reads in rows
with lots of columns quite slow.

We tried it with beta3 and got the same results, so that didn't do anything.
 
Another problem you may have is if you have the commitLog directory on the
same hard drive as the data directory. If that's the case and you read and
write at the same time, that may be a reason for poor read performance
(and write performance too).

We also tested doing only reads, and got about the same read speeds.
 
As for the row with 30 million columns, you have to be aware that right now,
Cassandra will deserialize whole rows during compaction
(http://wiki.apache.org/cassandra/CassandraLimitations).
So depending on the size of what you store in your columns, you could very
well hit that limitation (that could be why you OOM). In which case, I see
two choices: 1) add more RAM to the machine or 2) change your data structure
to avoid that (maybe you can split rows with too many columns somehow?).
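
To make the row-splitting idea concrete, here is a minimal sketch that spreads one logical row over several smaller rows by bucketing on the column name. It assumes the column names are integers; the function names, the "base:bucket" key layout and the default bucket size of 10000 are made up for illustration and do not come from Cassandra or any client library.

```python
from typing import List


def bucket_row_key(base_key: str, column: int, bucket_size: int = 10_000) -> str:
    """Return the key of the sub-row that holds `column` (hypothetical key layout)."""
    return f"{base_key}:{column // bucket_size}"


def rows_for_slice(base_key: str, start: int, finish: int,
                   bucket_size: int = 10_000) -> List[str]:
    """Return the sub-row keys that a slice over [start, finish] has to touch."""
    first = start // bucket_size
    last = finish // bucket_size
    return [f"{base_key}:{bucket}" for bucket in range(first, last + 1)]
```

With this layout a write goes to bucket_row_key(...) and a range read queries only the sub-rows returned by rows_for_slice(...), so no single row grows beyond bucket_size columns.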

Splitting the rows would be an option if we got anything near decent speed for small rows, but even if we only have a few hundred thousand columns in one row, the read speed is still slow.

What kind of numbers are common for this type of operation? Say that you have a row with 500000 columns whose names range from 0x0 to 0x7A120, and you do get_slice operations on it over randomly chosen ranges within that interval, with a fixed count of 1000, multithreaded across ~10 threads. Can't you get more than 50 reads/s?
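
For reference, the test described above can be sketched roughly as follows. This is only an outline of the measurement loop, not the code that was actually run: fake_get_slice and its (start, count) signature are hypothetical placeholders rather than the Cassandra client API, and a real run would replace it with a call that issues get_slice against the cluster.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

MAX_COLUMN = 0x7A120  # 500000, the top of the column-name range
SLICE_COUNT = 1000    # fixed count per slice
THREADS = 10


def fake_get_slice(start, count):
    """Hypothetical stand-in for a real client call; it only builds a list."""
    return list(range(start, min(start + count, MAX_COLUMN + 1)))


def worker(get_slice, reads, seed):
    """Issue `reads` slices from random start columns; return how many were done."""
    rng = random.Random(seed)
    done = 0
    for _ in range(reads):
        start = rng.randint(0, MAX_COLUMN)
        get_slice(start, SLICE_COUNT)
        done += 1
    return done


def run_benchmark(get_slice, reads_per_thread=100, threads=THREADS):
    """Run the workers in parallel and return (total reads, reads per second)."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker, get_slice, reads_per_thread, seed)
                   for seed in range(threads)]
        total = sum(future.result() for future in futures)
    elapsed = max(time.perf_counter() - started, 1e-9)  # guard against a zero interval
    return total, total / elapsed
```

Calling run_benchmark(fake_get_slice) only measures the harness itself; the reads/s figure becomes meaningful once the placeholder is swapped for a real read.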

When reading up on Cassandra, we saw posts saying that billions of columns in a row shouldn't be a problem. Sure enough, writing all that data goes pretty fast, but retrieving it is really slow. We also tried counting the columns in a row, and that was very slow too: it took half a minute to count the columns in a row with 500000 columns, and on a row with millions of columns it crashed with an OOM exception after a few minutes.


/Henrik