incubator-cassandra-user mailing list archives

From Philippe <watche...@gmail.com>
Subject Re: Meaning of values in tpstats
Date Mon, 12 Dec 2011 09:15:48 GMT
>
> Took me a while to figure out that // == "parallel" :)
>
Sorry, that's left over from Math classes :)


> I'm pretty sure (but not entirely, I'd have to check the code) that
> the request is forwarded as one request to the necessary node(s); what
>
Hmm... I hadn't even thought of the forwarding aspect.
So can someone knowledgeable confirm that a multigetslice on counter super
columns (assume CL=QUORUM):

   - gets sent as a whole to the coordinator
   - the coordinator resends it as a whole to the RF replicas and waits for
   QUORUM identical responses (and how does that work in a multiget? Does it
   hash the whole multiget or key by key?). Does this use one thread? What
   JMX counter does it show up in?
   - each replica runs the multiget in parallel, so a multiget with a lot of
   keys will saturate the ReadStage.Active counter
   - the coordinator sends the result back. Does this use one thread? What
   JMX counter does it show up in?
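
To make the third point concrete, here is a toy model of what I mean by
"saturate". It is not Cassandra code: it only assumes that each key in the
multiget becomes its own task on a bounded pool whose size plays the role of
the concurrent_reads setting. The function name and the sleep standing in for
a read are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_active(num_keys, concurrent_reads, read_seconds=0.005):
    """Toy model: submit one task per key to a bounded pool and
    return the highest number of tasks that ran at the same time."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def read_one_key():
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(read_seconds)  # stand-in for the real read (assumption)
        with lock:
            active -= 1

    # Leaving the with-block waits for all submitted tasks to finish.
    with ThreadPoolExecutor(max_workers=concurrent_reads) as pool:
        for _ in range(num_keys):
            pool.submit(read_one_key)
    return peak


if __name__ == "__main__":
    # 512 keys against a pool of 32: the peak can never exceed 32,
    # and the remaining tasks wait in the queue (the "pending" side).
    print(peak_active(512, 32))
```

If the model holds, the active count is capped by the pool size no matter how
many keys are in the batch, and everything beyond that just queues up.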



> I was saying rather was that the individual gets get queued up as
> individual tasks to be executed internally in the different stages.
> That does lead to parallelism locally on the node (subject to the
> concurrent reader setting).
>
Ok. So in my case, average ping is 0.2ms and stable, while the average
multigetslice takes 40ms for 512 keys at a time. So network time accounts
for about 0.5% of query time (0.2 / 40): I can lower the batch size
considerably without it hurting too much. And given the better results I've
seen on my workload with smaller batches, I'm going to do just that.
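
A quick back-of-the-envelope check of that ratio, as a sketch. The two
measured figures are the ones above. The estimate for smaller batches rests
on an assumption I have not measured: that server-side time scales linearly
with the number of keys and that each batch costs exactly one round trip.

```python
PING_MS = 0.2          # average round trip, from the measurements above
QUERY_MS = 40.0        # average multigetslice latency for 512 keys
KEYS_PER_BATCH = 512


def network_share(ping_ms, query_ms):
    """Fraction of the query latency spent on one network round trip."""
    return ping_ms / query_ms


def estimated_share(batch_size):
    """Estimated network share for a different batch size.

    ASSUMPTION (not measured): server-side time is linear in the number
    of keys, and every batch pays exactly one round trip.
    """
    server_ms_per_key = (QUERY_MS - PING_MS) / KEYS_PER_BATCH
    batch_ms = PING_MS + server_ms_per_key * batch_size
    return PING_MS / batch_ms


if __name__ == "__main__":
    print(f"measured share: {network_share(PING_MS, QUERY_MS):.1%}")
    for size in (512, 128, 32):
        print(f"{size} keys: {estimated_share(size):.1%}")
```

Under that linear assumption the network share grows as the batch shrinks,
but it stays a small part of the total even for batches well below 512 keys.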

 Philippe
