incubator-cassandra-user mailing list archives

From Per Olesen
Subject Re: Are 6..8 seconds to read 23.000 small rows - as it should be?
Date Fri, 04 Jun 2010 18:20:04 GMT

On Jun 4, 2010, at 5:19 PM, Ben Browning wrote:

> How many subcolumns are in each supercolumn and how large are the
> values? Your example shows 8 subcolumns, but I didn't know if that was
> the actual number. I've been able to read columns out of Cassandra at
> an order of magnitude higher than what you're seeing here but there
> are too many variables to directly compare.

There are very few columns in each SC: about 8, though it varies a bit. The column names and
values are pretty small, around 20-30 bytes for each column, I guess. So we are talking about
small amounts of data here.

Yes, I know there are too many variables, but I have the feeling - as you also write - that
the performance of this simple thing should be orders of magnitude better. 

So, how might I go about finding out why this takes so long in my specific setup?
Can I get timings from inside Cassandra itself?
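
One way to narrow this down from the client side is to time each stage of the read separately.
This is only a rough sketch: it measures wall-clock time in the client and says nothing about
what happens inside the server. The `timed` helper, the stage labels and the stand-in workload
are all hypothetical names made up for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink):
    """Add the wall-clock time spent in the with-block to sink[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink[label] = sink.get(label, 0.0) + (time.perf_counter() - start)


timings = {}

# Stand-in for the real read; replace with the actual client call.
with timed("fetch", timings):
    rows = [("key%05d" % i, {"name": "value"}) for i in range(23000)]

# Stand-in for whatever the application does with the rows afterwards.
with timed("convert", timings):
    converted = [dict(columns, key=key) for key, columns in rows]

for label, seconds in timings.items():
    print("%s: %.3f s" % (label, seconds))
```

If nearly all of the time lands in the fetch stage, the client-side processing is probably not
the problem.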

> Keep in mind that the results from each thrift call has to fit into
> memory - you might be better off paging through the 23000 columns,
> reading a few thousand at a time.

Yes, I know, and I might end up doing this. I do have pretty hard upper
limits on how many rows I will end up with for each key, but it might be a good idea
nonetheless. Thanks for the advice on that one.
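
The paging suggested above could look roughly like the sketch below. It assumes a
`fetch_slice(start, count)` function that returns up to `count` columns in name order, beginning
at `start` inclusive, with an empty string meaning "from the beginning". That function, and the
in-memory stand-in used to exercise it, are hypothetical and not part of any Cassandra client.

```python
from typing import Callable, Iterator, List, Tuple

Column = Tuple[str, str]  # (column name, column value)


def page_columns(fetch_slice: Callable[[str, int], List[Column]],
                 page_size: int = 1000) -> Iterator[Column]:
    """Yield every column, fetching at most page_size new columns per call."""
    start = ""  # assumption: empty start means "from the first column"
    first_page = True
    while True:
        # The start column is inclusive, so later pages request one extra
        # column and drop the one that was already yielded.
        count = page_size if first_page else page_size + 1
        batch = fetch_slice(start, count)
        if not first_page:
            batch = batch[1:]
        if not batch:
            return
        yield from batch
        if len(batch) < page_size:
            return  # a short page means there is nothing left
        start = batch[-1][0]
        first_page = False


def make_fake_fetch(columns: List[Column]) -> Callable[[str, int], List[Column]]:
    """In-memory stand-in for a real slice call, for trying out the loop."""
    ordered = sorted(columns)

    def fetch_slice(start: str, count: int) -> List[Column]:
        return [c for c in ordered if c[0] >= start][:count]

    return fetch_slice


columns = [("col%05d" % i, str(i)) for i in range(23000)]
paged = list(page_columns(make_fake_fetch(columns), page_size=2000))
```

Each call then holds only one page in memory at a time, at the cost of more round trips.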

