Thanks for the confirmation. Those are interesting alternatives for avoiding a random coordinator.
Are there any blogs or writeups about these approaches (a primary node as coordinator) being used in production? I googled but could not find anything relevant.

On Wed, Feb 16, 2011 at 3:25 AM, Oleg Anastasyev <> wrote:
A J <s5alye <at>> writes:

> Makes sense ! Thanks.
> Just a quick follow-up:
> Now I understand the write is not made to the coordinator (unless it is one
> of the replicas for that key). But does the column data being written 'flow'
> through the coordinator node? For a 2G column write, will I see 2G of network
> traffic on the coordinator node, or just a few bytes of traffic from the
> coordinator reading the key and talking to the nodes, the client, etc.?

Yes. If you talk to a random node first (which then acts as coordinator), all
2G of traffic will flow to it first and then be forwarded to the natural nodes
(those owning replicas of the row to be written).
If you want to avoid the extra traffic, determine the natural nodes of the row
and send your write directly to one of them, so that a natural node becomes
the coordinator. This natural coordinator will apply the write locally and
send it to the other replicas in parallel.
If your client is written in Java, this can be implemented relatively easily.
Look at TokenMetadata.ringIterator().
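
To illustrate the idea of picking a natural node on the client side, here is a
minimal, self-contained sketch. It does not use Cassandra's classes: the
TokenAwareRouting class, its TreeMap ring, the MD5-based tokenFor() and the
"walk clockwise and take the next RF distinct nodes" placement are all
assumptions made for illustration (roughly what a simple, rack-unaware
replication strategy does), so treat it as a model of the technique rather
than the actual TokenMetadata API.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical client-side model of the ring; not a Cassandra class.
class TokenAwareRouting {
    // token -> node address, kept sorted by token.
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    void addNode(BigInteger token, String address) {
        ring.put(token, address);
    }

    // Assumption: an MD5-derived non-negative token, in the spirit of a
    // hash-based partitioner. The real partitioner may differ in detail.
    static BigInteger tokenFor(String rowKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rowKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Walk the ring clockwise, starting at the first node whose token is
    // >= keyToken (wrapping around), and collect up to replicationFactor
    // distinct nodes. Any of them can serve as a "natural" coordinator.
    List<String> naturalNodes(BigInteger keyToken, int replicationFactor) {
        List<String> clockwise = new ArrayList<>(ring.tailMap(keyToken, true).values());
        clockwise.addAll(ring.headMap(keyToken, false).values());

        List<String> replicas = new ArrayList<>();
        for (String node : clockwise) {
            if (replicas.size() == replicationFactor) {
                break;
            }
            if (!replicas.contains(node)) {
                replicas.add(node);
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        TokenAwareRouting routing = new TokenAwareRouting();
        routing.addNode(BigInteger.valueOf(0), "10.0.0.1");
        routing.addNode(BigInteger.valueOf(100), "10.0.0.2");
        routing.addNode(BigInteger.valueOf(200), "10.0.0.3");

        // A key with token 150 falls to the node at token 200, then wraps.
        System.out.println(routing.naturalNodes(BigInteger.valueOf(150), 2));
        // prints [10.0.0.3, 10.0.0.1]
    }
}
```

A client would compute tokenFor(rowKey), call naturalNodes() with the
keyspace's replication factor, and open its connection to the first node that
is reachable. The ring contents have to come from the cluster itself, which is
the part this sketch leaves out.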

If you are not required to use Cassandra's Thrift interface, it could be more
efficient to write through the StorageProxy interface. It plays the role of a
local coordinator and talks directly to all replicas, so the 2G will be passed
directly from your client to all row replicas.
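
For a rough comparison of the three options above, the sketch below counts how
many payload bytes cross the network for one write. It is back-of-the-envelope
arithmetic only: the CoordinatorTraffic class and its method names are
hypothetical, the replication factor (rf) is a parameter I am assuming, the
random coordinator is assumed not to be a replica itself, and protocol
overhead, acks and retries are ignored.

```java
// Hypothetical helper for estimating payload bytes on the wire per write.
class CoordinatorTraffic {
    // Client -> random (non-replica) coordinator, then coordinator -> rf replicas.
    static long randomCoordinator(long payloadBytes, int rf) {
        return payloadBytes + (long) rf * payloadBytes;
    }

    // Client -> natural coordinator (writes locally), then -> the other rf - 1 replicas.
    static long naturalCoordinator(long payloadBytes, int rf) {
        return payloadBytes + (long) (rf - 1) * payloadBytes;
    }

    // Client acts as coordinator itself and sends one copy to each of rf replicas.
    static long clientAsCoordinator(long payloadBytes, int rf) {
        return (long) rf * payloadBytes;
    }

    public static void main(String[] args) {
        long twoGiB = 2L * 1024 * 1024 * 1024;
        long gib = 1024L * 1024 * 1024;
        int rf = 3; // assumed replication factor

        System.out.println(randomCoordinator(twoGiB, rf) / gib);   // prints 8
        System.out.println(naturalCoordinator(twoGiB, rf) / gib);  // prints 6
        System.out.println(clientAsCoordinator(twoGiB, rf) / gib); // prints 6
    }
}
```

Under these assumptions the last two options move the same total volume; the
difference is that with the client as coordinator every copy leaves from the
client's own network link, with no server node relaying the data.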

> This will be a factor for us. So need to make sure exactly.