I tend to caution against making very large batch mutations or multigets, by which I mean hundreds of rows at a time.

Each row request becomes a task, and together they can temporarily fill the mutation or read thread pool. That means overall *client* request throughput drops while a big request is chewed through.

This is more of an issue with smaller clusters. As Dean says, the client request is performed in parallel on multiple machines.
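
One way to act on this is to split a large key list into smaller batches on the client side. The following is only a minimal sketch: `fetch_batch` is a hypothetical callable standing in for whatever multiget call your client exposes (it is not a Hector or Thrift API), and the batch size of 50 is an arbitrary example value, not a recommendation.

```python
from typing import Callable, Dict, Iterator, Sequence


def chunked(keys: Sequence[bytes], size: int) -> Iterator[Sequence[bytes]]:
    """Yield consecutive slices of `keys`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def multiget_in_batches(
    keys: Sequence[bytes],
    fetch_batch: Callable[[Sequence[bytes]], Dict[bytes, list]],
    batch_size: int = 50,
) -> Dict[bytes, list]:
    """Call `fetch_batch` once per small batch and merge the results.

    `fetch_batch` is a hypothetical stand-in for the client's multiget call.
    """
    result: Dict[bytes, list] = {}
    for batch in chunked(keys, batch_size):
        result.update(fetch_batch(batch))
    return result
```

With 120 keys and a batch size of 50, this issues three requests of 50, 50 and 20 keys, so no single request puts more than 50 row tasks on the server at once.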


Aaron Morton
Freelance Cassandra Developer
New Zealand


On 12/12/2012, at 3:03 AM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

Each node is doing its thing in parallel. They deliberately do NOT coordinate, as they do not need to, so each one is doing its scan individually on the rows it has.

If all rows "happen" to be on the same server, sure, some may be done sequentially, depending on the number of rows vs. the thread pool size.
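
As a rough back-of-the-envelope model of that point (a sketch, not a description of Cassandra's actual scheduler), assume every row read takes about the same time and the node's read pool is otherwise idle. Under those assumptions the rows on one node are processed in roughly `ceil(rows / pool_size)` sequential rounds. The name `sequential_rounds` and the pool size of 32 are illustrative only.

```python
import math


def sequential_rounds(rows_on_node: int, pool_size: int) -> int:
    """Estimate how many back-to-back rounds a node needs for its rows.

    Simplified model: each row is one task, all tasks take equal time,
    and the pool has `pool_size` threads with nothing else queued.
    """
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    if rows_on_node < 0:
        raise ValueError("rows_on_node must not be negative")
    return math.ceil(rows_on_node / pool_size)


# 100 rows landing on one node with an example pool of 32 threads
print(sequential_rounds(100, 32))  # 4
# 10 rows fit in a single round
print(sequential_rounds(10, 32))   # 1
```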

As far as a single row is concerned, I know mutations to a single row are serialised, as Aaron has said as much, but you are talking about multiple rows here.


From: Wei Zhu <wz1975@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Wei Zhu <wz1975@yahoo.com>
Date: Monday, December 10, 2012 3:15 PM
To: Cassandr usergroup <user@cassandra.apache.org>
Subject: Re: multiget_slice SlicePredicate

Well, I'm not sure how parallel multiget is. Someone said the requests are sent to the different nodes in parallel and executed sequentially on each node. I haven't looked into the source code yet. Does anyone know for sure?

I am using Hector; I just copied the Thrift definition from the Cassandra site for reference.

You are right, the count is for each individual row.
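
To make the per-row behaviour concrete, here is a toy in-memory model. It is only a sketch of the semantics described in this thread, not the server's implementation: `multiget_slice_model` and its data layout are hypothetical, and an empty `finish` is treated as "no upper bound".

```python
from typing import Dict, List, Tuple

Row = List[Tuple[str, str]]  # (column name, value) pairs, sorted by name


def multiget_slice_model(
    data: Dict[str, Row], keys: List[str], start: str, finish: str, count: int
) -> Dict[str, Row]:
    """Toy model: apply the same slice to each requested row independently."""
    result: Dict[str, Row] = {}
    for key in keys:
        row = data.get(key, [])
        in_range = [
            (name, value)
            for name, value in row
            if name >= start and (finish == "" or name <= finish)
        ]
        result[key] = in_range[:count]  # the limit is applied per row
    return result


# Two rows with five columns each; count=3 yields 3 columns per row, 6 in total.
data = {
    "row1": [("c%d" % i, "v") for i in range(5)],
    "row2": [("c%d" % i, "v") for i in range(5)],
}
out = multiget_slice_model(data, ["row1", "row2"], start="", finish="", count=3)
print({key: len(cols) for key, cols in out.items()})  # {'row1': 3, 'row2': 3}
```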


From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Wei Zhu <wz1975@yahoo.com>
Sent: Monday, December 10, 2012 1:13 PM
Subject: Re: multiget_slice SlicePredicate

What's wrong with multiget? Parallel performance is great from multiple disks, so usually that is a good thing.

Also, something looks wrong: since you have list<binary> keys, I would expect the Map to be Map<binary, list<ColumnOrSuperColumn>>.

Are you sure you have that correct? If you set range to 100, it should be 100 columns for each row, but it never hurts to run the code and verify.

PlayOrm Developer

From: Wei Zhu <wz1975@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Wei Zhu <wz1975@yahoo.com>
Date: Monday, December 10, 2012 2:07 PM
To: Cassandr usergroup <user@cassandra.apache.org>
Subject: multiget_slice SlicePredicate

I know it's probably not a good idea to use multiget, but for my use case it's the only choice.

I have a question regarding the SlicePredicate argument of multiget_slice.

The SlicePredicate takes slice_range, which takes start, end and range. I suppose start and end apply to each individual row. How about range: is it a cumulative column count across all the rows, or does it apply to each individual row?
If I set range to 100, is it 100 columns per row, or 100 in total?

Thanks for your reply,


map<string,list<ColumnOrSuperColumn>> multiget_slice(list<binary> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level)