incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: multiget_slice SlicePredicate
Date Tue, 11 Dec 2012 20:17:49 GMT
I tend to caution against making very large batch mutations or multigets, by which I mean
hundreds of rows at a time.

Each row request becomes a task, and together they can temporarily fill the mutation or read thread
pool, meaning overall *client* request throughput drops while a big request is chewed through.
 

This is more of an issue with smaller clusters. As Dean says, the client request is performed
in parallel on multiple machines.
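
One way to act on that caution is to split a large key list into smaller batches on the client side, so that no single request queues hundreds of row tasks at once. The following is a minimal sketch in Python, not Hector or Thrift code: `chunked`, `multiget_in_batches`, the `fetch` callable and the batch size of 50 are all hypothetical names and values chosen for illustration, and `fetch` stands in for whatever multiget call your client library provides.

```python
from typing import Callable, Dict, Iterator, List, Sequence

# Hypothetical type for a client multiget call: takes a batch of row keys
# and returns a mapping of row key -> list of columns.
Fetch = Callable[[Sequence[bytes]], Dict[bytes, List]]


def chunked(keys: Sequence[bytes], size: int) -> Iterator[Sequence[bytes]]:
    """Yield consecutive slices of `keys`, each holding at most `size` keys."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def multiget_in_batches(fetch: Fetch, keys: Sequence[bytes],
                        batch_size: int = 50) -> Dict[bytes, List]:
    """Issue one `fetch` call per batch and merge the per-batch results."""
    results: Dict[bytes, List] = {}
    for batch in chunked(keys, batch_size):
        results.update(fetch(batch))
    return results


if __name__ == "__main__":
    # Stand-in for a real client call; it returns an empty column list per key.
    def fake_fetch(batch: Sequence[bytes]) -> Dict[bytes, List]:
        return {key: [] for key in batch}

    row_keys = [f"row{i}".encode() for i in range(120)]
    rows = multiget_in_batches(fake_fetch, row_keys, batch_size=50)
    print(len(rows))
```

The right batch size depends on cluster size and thread pool settings, so the value used here is only a placeholder to tune.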

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 12/12/2012, at 3:03 AM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

> Each node is doing its thing in parallel….they on purpose do NOT co-ordinate as they
do not need to, so each one is doing its scan on the rows it has individually.
> 
> If all rows "happen" to be on the same server, sure some may be done sequentially depending
on number of rows vs. thread pool size.
> 
> As far as a single row is concerned, I know mutations to a single row are serialised
as Aaron has said as much but you are talking about multiple rows here.
> 
> Later,
> Dean
> 
> From: Wei Zhu <wz1975@yahoo.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Wei Zhu <wz1975@yahoo.com>
> Date: Monday, December 10, 2012 3:15 PM
> To: Cassandr usergroup <user@cassandra.apache.org>
> Subject: Re: multiget_slice SlicePredicate
> 
> Well, I'm not sure how parallel multiget is. Someone said it sends requests in parallel
to the different nodes, and on each node it's executed sequentially. I haven't bothered looking
into the source code yet. Does anyone know for sure?
> 
> I am using Hector, just copied the thrift definition from Cassandra site for reference.
> 
> You are right, the count is for each individual row.
> 
> Thanks.
> -Wei
> 
> ________________________________
> From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Wei Zhu <wz1975@yahoo.com>
> Sent: Monday, December 10, 2012 1:13 PM
> Subject: Re: multiget_slice SlicePredicate
> 
> What's wrong with multiget…parallel performance is great from multiple disks and so
usually that is a good thing.
> 
> Also, something looks wrong, since you have list<binary> keys, I would expect the
Map to be Map<binary, list<ColumnOrSuperColumn>>
> 
> Are you sure you have that correct?  If you set range to 100, it should be 100 columns
for each row, but it never hurts to run the code and verify.
> 
> Later,
> Dean
> PlayOrm Developer
> 
> 
> From: Wei Zhu <wz1975@yahoo.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Wei Zhu <wz1975@yahoo.com>
> Date: Monday, December 10, 2012 2:07 PM
> To: Cassandr usergroup <user@cassandra.apache.org>
> Subject: multiget_slice SlicePredicate
> 
> I know it's probably not a good idea to use multiget, but for my use case, it's the only
choice.
> 
> I have a question regarding the SlicePredicate argument of multiget_slice.
> 
> 
> The SlicePredicate takes slice_range, which takes start, end and range. I suppose start
and end will apply to each individual row. How about range: is it a cumulative column count
across all the rows, or does it apply to each individual row?
> If I set range to 100, is it 100 columns per row, or total?
> 
> Thanks for your reply,
> -Wei
> 
> multiget_slice
> 
> map<string,list<ColumnOrSuperColumn>> multiget_slice(list<binary> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level)
> 
> 
> 
> 
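
The per-row behaviour confirmed in the thread (the column limit, which the thread calls "range" and also "count", applies to each row rather than to the request as a whole) can be illustrated with a small in-memory model. This is only a sketch of the semantics as described above, not Cassandra or Hector code: `slice_row` and `multiget_slice_model` are hypothetical names, the sketch assumes an empty `start` or `finish` means "unbounded", and returning an empty list for a missing key is a choice made by the model.

```python
from typing import Dict, List, Sequence, Tuple

Column = Tuple[str, str]  # (column name, value)


def slice_row(columns: Sequence[Column], start: str, finish: str,
              count: int) -> List[Column]:
    """Return at most `count` columns of one row with start <= name <= finish.

    Assumption for this sketch: an empty `start` or `finish` is unbounded.
    """
    selected: List[Column] = []
    for name, value in sorted(columns):
        if start and name < start:
            continue  # before the slice begins
        if finish and name > finish:
            break  # past the end of the slice
        if len(selected) >= count:
            break  # per-row limit reached
        selected.append((name, value))
    return selected


def multiget_slice_model(rows: Dict[str, List[Column]], keys: Sequence[str],
                         start: str = "", finish: str = "",
                         count: int = 100) -> Dict[str, List[Column]]:
    """Apply the same slice independently to every requested row."""
    return {key: slice_row(rows.get(key, []), start, finish, count)
            for key in keys}


if __name__ == "__main__":
    data = {
        "row1": [("c1", "a"), ("c2", "b"), ("c3", "c")],
        "row2": [("c1", "x"), ("c2", "y"), ("c3", "z")],
    }
    result = multiget_slice_model(data, ["row1", "row2"], count=2)
    # Two columns come back for each row, four in total.
    print({key: len(cols) for key, cols in result.items()})
```

As Dean suggests, running a real query against a test column family is still the way to verify the behaviour of the actual server.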

