cassandra-user mailing list archives

From Graham Sanderson <gra...@vast.com>
Subject Re: Does SELECT … IN () use parallel dispatch?
Date Fri, 25 Jul 2014 18:40:06 GMT
Of course, the driver in question is allowed to be smarter, and can do so if you use a ? parameter
for a list or even for individual elements.

I'm not sure which drivers, if any, currently do this, but we plan to combine this with
token-aware routing in our Scala driver in the future.
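
For illustration only, here is a rough Python sketch of the client-side fan-out discussed in this thread: one request per key, issued in parallel, with results consumed as they arrive. It does not talk to Cassandra. `fetch_row`, `fetch_all`, `run_demo` and the latencies are all hypothetical stand-ins for whatever asynchronous single-key query a real driver provides.

```python
import asyncio


# Hypothetical stand-in for a driver's asynchronous single-key query
# (something like "SELECT ... WHERE id = ?"). It only simulates the
# latency of the node that owns the key; no real Cassandra call is made.
async def fetch_row(key, latency):
    await asyncio.sleep(latency)
    return {"id": key}


async def fetch_all(keys, latencies):
    # Issue one request per key up front instead of a single IN (...) query.
    tasks = [
        asyncio.create_task(fetch_row(key, latency))
        for key, latency in zip(keys, latencies)
    ]
    rows = []
    # as_completed yields each result as soon as it is ready, so rows from
    # fast nodes are available before the slowest node has answered.
    for finished in asyncio.as_completed(tasks):
        rows.append(await finished)
    return rows


def run_demo():
    keys = [1, 2, 3]
    latencies = [0.15, 0.05, 0.10]  # simulated per-node latencies, in seconds
    return asyncio.run(fetch_all(keys, latencies))
```

With a single IN () query the caller sees nothing until the coordinator has heard from every node. Here the rows come back in completion order, so the row for key 2 (the fastest simulated node) is available first.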

Sent from my iPhone

> On Jul 25, 2014, at 1:14 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:
> 
> Nope. SELECT ... IN() sends one request to a coordinator. This coordinator dispatches the
> request to 50 nodes, as in your example, and waits for all 50 responses before sending back
> the final result. As you can guess, this approach is not optimal, since the overall request
> latency is bound by the slowest of the 50 nodes.
> 
> On the other hand, if you use the async feature of the native protocol, your client will
> issue 50 requests in parallel, and the answers arrive as soon as they are fetched from the
> different nodes.
> 
> Clearly, the only advantage of using the IN() clause is ease of querying. I would advise
> using IN() only when you have a "few" values, not 50.
> 
> 
>> On Fri, Jul 25, 2014 at 8:08 PM, Kevin Burton <burton@spinn3r.com> wrote:
>> Say I have about 50 primary keys I need to fetch.
>> 
>> I'd like to use parallel dispatch, so that if I have 50 hosts and each has a record,
>> I can read from all 50 at once.
>> 
>> I assume Cassandra does the right thing here? I believe it does… at least from
>> reading the docs, but it's still a bit unclear.
>> 
>> Kevin
>> 
>> -- 
>> Founder/CEO Spinn3r.com
>> Location: San Francisco, CA
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> 
> 
