Have you tried using a super column? It seems that having a row with over 100K columns and growing would be a lot for Cassandra to deserialize. What are iostat and jmeter telling you? It would be interesting to see that data. Also, what are you using for your key or row caching? Do you need to use quorum consistency? That can slow down reads as well; can you use a lower consistency level?
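For reference, a QUORUM read has to wait on floor(RF/2)+1 replicas, versus a single replica for ONE. A quick standalone sketch of that arithmetic (the class and method names here are mine, purely for illustration):

```java
// Illustration only: how many replicas each consistency level waits on per read.
public class QuorumMath {
    // QUORUM = floor(RF / 2) + 1 replicas must respond.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        // With RF=3, QUORUM waits on 2 replicas; ONE waits on just 1.
        System.out.println("RF=3 QUORUM waits on " + quorum(3) + " replicas; ONE waits on 1");
    }
}
```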

On Tue, Aug 24, 2010 at 9:14 PM, B. Todd Burruss <bburruss@real.com> wrote:
i am using get_slice to pull columns from a row to emulate a queue.  column names are TimeUUID and the values are small, < 32 bytes.  simple ColumnFamily.

i am using SlicePredicate like this to pull the first ("oldest") column in the row:

       // empty start/finish = begin at the first column in the row; count = 1
       SlicePredicate predicate = new SlicePredicate();
       predicate.setSlice_range(new SliceRange(new byte[] {}, new byte[] {}, false, 1));

       get_slice(rowKey, colParent, predicate, QUORUM);

once i get the column i remove it.  so there are a lot of gets and mutates, leaving lots of deleted columns.

get_slice starts off performing just fine, but then falls off dramatically as the number of columns grows.  at its peak there are 100,000 columns and get_slice is taking over 100ms to return.
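my working theory is the deleted columns: a standalone sketch of the pattern (not Cassandra code; class and method names are mine), assuming deletes linger as tombstones that the slice has to skip past until compaction purges them:

```java
import java.util.TreeMap;

// Illustration only: a queue emulated by "read first column, then delete it".
// If deleted columns remain as tombstones, finding the first live column
// means scanning past every tombstone that precedes it.
public class TombstoneScanDemo {
    // column name -> value; a null value stands in for a tombstone
    static final TreeMap<Long, String> row = new TreeMap<>();

    // number of entries examined before the first live column is found
    static long scanForFirstLive() {
        long examined = 0;
        for (String v : row.values()) {
            examined++;
            if (v != null) break;
        }
        return examined;
    }

    public static void main(String[] args) {
        for (long i = 0; i < 100_000; i++) row.put(i, null); // 100k tombstones
        row.put(100_000L, "live");                           // one live column
        System.out.println("entries scanned: " + scanForFirstLive());
        // prints "entries scanned: 100001" -- the scan cost grows with every delete
    }
}
```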

i am running a single instance of cassandra 0.7 on localhost, default config.  i've done some googling and can't find any tweaks or tuning suggestions specific to get_slice.  i already know about separating commitlog and data, watching iostat, GC, etc.

any low-hanging tuning fruit anyone can think of?  in 0.6 i recall an index for columns; maybe that is what i need?