If you have something like 10k rows and fetch 100 columns per row, this is going to choke the cluster... been there. If you really still have to use multiget_slice, try slicing your data before calling multiget_slice, and check whether the cluster's pending read requests increase. If the pending count is going up, slow down the rate at which the client sends requests to the cluster. :)
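
For what it's worth, here is a rough Python sketch of that idea, reading "slice your data" as splitting the key list into smaller batches. Everything in it is an assumption made for illustration: `fetch_batch` stands in for whatever your client uses to issue one multiget_slice call, `pending_reads` stands in for however you read the cluster's pending read count, and the batch size and threshold are arbitrary numbers, not recommendations.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Sequence


def chunked(keys: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` keys."""
    if size <= 0:
        raise ValueError("size must be positive")
    for i in range(0, len(keys), size):
        yield list(keys[i:i + size])


def multiget_in_batches(
    keys: Sequence[str],
    fetch_batch: Callable[[List[str]], Dict[str, list]],  # hypothetical: one multiget_slice call
    batch_size: int = 100,
    pending_reads: Optional[Callable[[], int]] = None,    # hypothetical: pending read count probe
    max_pending: int = 32,
    pause_seconds: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, list]:
    """Fetch rows batch by batch, pausing while the pending count is too high."""
    results: Dict[str, list] = {}
    for batch in chunked(keys, batch_size):
        # Back off before sending the next batch if reads are piling up.
        while pending_reads is not None and pending_reads() > max_pending:
            sleep(pause_seconds)
        results.update(fetch_batch(batch))
    return results
```

With a Hector or Thrift client you would plug the real multiget call in as `fetch_batch`; the loop itself does not depend on any particular client.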


On Tue, Dec 11, 2012 at 6:15 AM, Wei Zhu <wz1975@yahoo.com> wrote:
Well, I am not sure how parallel multiget is. Someone said that it sends requests to the different nodes in parallel, and that on each node they are executed sequentially. I haven't bothered looking into the source code yet. Does anyone know for sure?

I am using Hector; I just copied the Thrift definition from the Cassandra site for reference.

You are right, the count is for each individual row.
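
To make the per-row behaviour concrete, here is a toy in-memory model in Python. It is not Cassandra or Hector code; `slice_row` and `toy_multiget_slice` are made-up names, columns are compared as plain strings, and returning an empty list for a missing key is just a choice made for this sketch. The only point is that the count limit is applied to each row separately rather than to the combined result.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]  # column name -> value


def slice_row(columns: Row, start: str = "", finish: str = "",
              count: int = 100) -> List[Tuple[str, str]]:
    """Return up to `count` columns of one row, in name order, within [start, finish]."""
    names = sorted(columns)
    selected = [
        n for n in names
        if (start == "" or n >= start) and (finish == "" or n <= finish)
    ]
    return [(n, columns[n]) for n in selected[:count]]


def toy_multiget_slice(rows: Dict[str, Row], keys: Sequence[str],
                       start: str = "", finish: str = "",
                       count: int = 100) -> Dict[str, List[Tuple[str, str]]]:
    """Apply the same slice, including the count limit, to every requested row."""
    return {k: slice_row(rows.get(k, {}), start, finish, count) for k in keys}


rows = {
    "row1": {"c1": "a", "c2": "b", "c3": "c"},
    "row2": {"c1": "x", "c2": "y", "c3": "z"},
}
result = toy_multiget_slice(rows, ["row1", "row2"], count=2)
# Each row contributes up to 2 columns, so 4 columns come back in total.
```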

Thanks.
-Wei 


From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Wei Zhu <wz1975@yahoo.com>
Sent: Monday, December 10, 2012 1:13 PM
Subject: Re: multiget_slice SlicePredicate

What's wrong with multiget? Parallel performance is great from multiple disks, so usually that is a good thing.

Also, something looks wrong: since you have list<binary> keys, I would expect the Map to be Map<binary, list<ColumnOrSuperColumn>>.

Are you sure you have that correct? If you set the count to 100, it should be 100 columns for each row, but it never hurts to run the code and verify.

Later,
Dean
PlayOrm Developer


From: Wei Zhu <wz1975@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Wei Zhu <wz1975@yahoo.com>
Date: Monday, December 10, 2012 2:07 PM
To: Cassandra usergroup <user@cassandra.apache.org>
Subject: multiget_slice SlicePredicate

I know it's probably not a good idea to use multiget, but for my use case, it's the only choice.

I have a question regarding the SlicePredicate argument of multiget_slice.


The SlicePredicate takes a slice_range, which takes start, finish and count. I suppose start and finish apply to each individual row. How about count: is it a cumulative column count across all the rows, or does it apply to each individual row?
If I set count to 100, is it 100 columns per row, or 100 in total?

Thanks for your reply,
-Wei

multiget_slice

* map<string,list<ColumnOrSuperColumn>> multiget_slice(list<binary> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level)