incubator-cassandra-user mailing list archives

From DE VITO Dominique <dominique.dev...@thalesgroup.com>
Subject RE: cost estimate about some Cassandra patchs
Date Mon, 06 May 2013 17:27:51 GMT
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Sunday, 28 April 2013 22:54
> To: user@cassandra.apache.org
> Subject: Re: cost estimate about some Cassandra patchs
>
> > Does anyone know enough of the inner workings of Cassandra to tell me how much work
> > is needed to patch Cassandra to enable such communication vectorization/batching?
>

> Assuming you mean "have the coordinator send multiple row read/write requests in a single
> message to replicas"
>
> Pretty sure this has been raised as a ticket before but I cannot find one now.
>
> It would be a significant change and I'm not sure how big the benefit is. To send the
> messages, the coordinator places them in a queue, so there is little delay in sending. It then
> waits on them asynchronously. So there may be some saving on networking, but from the
> coordinator's point of view I think the impact is minimal.
>
> What is your use case?

Use case: rows with a row key of the form (folder id, file id).
Operations read/write multiple rows with the same folder id, so it could make sense
to have a partitioner that puts rows with the same "folder id" on the same replicas.
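
To make the placement idea concrete, here is a minimal sketch in Python (not Cassandra code). It assumes a row key encoded as "folder_id:file_id", an MD5-based token in the spirit of RandomPartitioner, and a simple ring with one token per node and no vnodes. The names folder_token and replicas_for are hypothetical.

```python
import bisect
import hashlib


def folder_token(row_key, sep=":"):
    """Hypothetical token function: hash only the folder id part of the key.

    Assumes the row key is encoded as "<folder_id><sep><file_id>".
    """
    folder_id = row_key.split(sep, 1)[0]
    digest = hashlib.md5(folder_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def replicas_for(token, ring, rf):
    """Pick rf nodes by walking a sorted (token, node) ring clockwise.

    Assumes one token per node, so consecutive ring entries are distinct nodes.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Toy 4-node ring with evenly spaced tokens over the 128-bit MD5 range.
ring = [(i * 2**126, "node%d" % i) for i in range(4)]

replicas_a = replicas_for(folder_token("f1:a"), ring, rf=2)
replicas_b = replicas_for(folder_token("f1:b"), ring, rf=2)
```

Because the file id never enters the hash, every row of a folder gets the same token and therefore the same replica set. The trade-off is that a very large folder becomes a hot spot on those replicas.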

But so far, Cassandra is not able to exploit this locality, because the batching stops at the
coordinator node: the coordinator still sends one message per row to the replicas.
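
As an illustration of what I mean by batching beyond the coordinator, here is a minimal sketch in Python of the grouping step only. It is not how Cassandra works today and not an actual patch. The function group_by_replica and the toy placement toy_replicas_of are hypothetical names.

```python
from collections import defaultdict


def group_by_replica(row_keys, replicas_of):
    """Group row keys by target replica so each replica could get one message.

    replicas_of is any callable mapping a row key to its list of replica nodes.
    """
    batches = defaultdict(list)
    for key in row_keys:
        for node in replicas_of(key):
            batches[node].append(key)
    return dict(batches)


def toy_replicas_of(row_key):
    """Toy placement (assumption): the folder id alone decides the replicas."""
    folder_id = row_key.split(":", 1)[0]
    placement = {"f1": ["node0", "node1"], "f2": ["node2", "node3"]}
    return placement[folder_id]


keys = ["f1:a", "f1:b", "f1:c", "f2:x"]
batches = group_by_replica(keys, toy_replicas_of)

# One message per (row, replica) pair versus one message per replica.
unbatched_messages = sum(len(rows) for rows in batches.values())
batched_messages = len(batches)
```

In this toy case the coordinator would send 4 messages instead of 8. The saving grows with the number of rows that share a folder id.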

Hence my question about the cost estimate for patching Cassandra.

The closest JIRA entries I have found so far (possibly matching my need exactly?) are:

CASSANDRA-166: Support batch inserts for more than one key at once
https://issues.apache.org/jira/browse/CASSANDRA-166
=> "WON'T FIX" status

CASSANDRA-5034: Refactor to introduce Mutation Container in write path
https://issues.apache.org/jira/browse/CASSANDRA-5034
=> I am not sure whether it is related to my topic

Thanks.

Dominique



>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com

On 27/04/2013, at 4:04 AM, DE VITO Dominique <dominique.devito@thalesgroup.com> wrote:


Hi,

We have created a new partitioner that groups some rows with **different** row keys on the
same replicas.

But neither batch_mutate nor multiget_slice is able to take advantage of this partitioner-defined
placement to vectorize/batch communications between the coordinator and the replicas.

Does anyone know enough of the inner workings of Cassandra to tell me how much work is needed
to patch Cassandra to enable such communication vectorization/batching?

Thanks.

Regards,
Dominique



