cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sylvain Lebresne <>
Subject Re: architectural understanding of write operation node flow
Date Tue, 24 Jan 2012 09:30:55 GMT
On Tue, Jan 24, 2012 at 9:57 AM, Peter Dijkshoorn
<> wrote:
> yeah, well main question remains then, is the node receiving the request
> from the client called the coordinator (even if it is not responsible
> for that key)?


> Or will that node forward the call to the first responsible node who
> does the coordinating stuff? (as the cassandra and dynamo paper state)
> In case of that forwarding, is the client told to connect to another
> node, or does the node receiving the call act as a proxy?
> so, is it a 3-deep or a 2-deep network call?

2-deep (except for counter increments which have a slightly different protocol).

> I want to explain this in a small literature part of my thesis to
> distinguish the internal structure against BigTable and various kinds of
> RDBMS replication schemes, that's why I want to know precisely :)
> Thanks,
> Peter Dijkshoorn
> Adyen - Payments Made Easy
> Visiting Address:                 Mail Address:
> Stationsplein 57 - 4th floor      P.O. Box 10095
> 1012 AB Amsterdam                 1001 EB Amsterdam
> The Netherlands                   The Netherlands
> Office +
> Email
> On 01/23/2012 06:59 PM, Daniel Doubleday wrote:
>> Ouch :-) you were asking write ...
>> Well kind of similar
>> 1. Coordinator calculates all nodes
>> 2. If not enough (according to CL) nodes are alive it throughs unavailable
>> 3. If nodes are down it writes and hh is enabled it writes a hint for that row
>> 4. It sends write request to all nodes (including itself / shortcutting messaging)
>> 5. If it receives enough (according to CL) acks before timeout everything is fine
otherwise it throughs unavailable
>> errm .. I'm more confident in the read path though especially concerning hh handling
so I'm happy to be corrected here. I.e. I'm not sure if hints are written when request time
out but CL is reached.
>> On Jan 23, 2012, at 6:47 PM, Daniel Doubleday wrote:
>>> Your first thought was pretty much correct:
>>> 1. The node which is called by the client is the coordinator
>>> 2. The coordinator determines the nodes in the ring which can handle the request
ordered by expected latency (via snitch). The coordinator may or may not be part of these
>>> 3. Given the consistency level and read repair chance the coordinator calculates
the min amount of node to ask and sends read requests to them
>>> 4. As soon as the minimum count (according to consistency) of responses is collected
the coordinator will respond to the request. Mismatches will lead to repair write requests
to the corresponding nodes
>>> Thus the minimal depth is one (CL = 1 and coordinator can handle the request
itself) or two otherwise.
>>> Hope that helps
>>> On Jan 23, 2012, at 4:47 PM, Peter Dijkshoorn wrote:
>>>> Hi guys,
>>>> I got an architectural question about how a write operation flows
>>>> through the nodes.
>>>> As far as I understand now, a client sends its write operation to
>>>> whatever node it was set to use and if that node does not contain the
>>>> data for this key K, then this node forwards the operation to the first
>>>> node given by the hash function. This first node having key K then
>>>> contacts the replication nodes depending on the selected consistency level.
>>>> This means that in the unlucky event you always have a network call
>>>> sequence depth of 2 (consistency level one), or 3 (assumed that the
>>>> replication nodes are contacted in parallel)
>>>> This is more than I expected, so I am not sure whether this is correct?
>>>> can someone help me out?
>>>> At first I thought that the receiver was the coordinator, and thus doing
>>>> all further calls in parallel, the depth as described above would always
>>>> be 2. But I just discovered that I was wrong and that it should be
>>>> something like above.
>>>> Another possibility would be that the client learnt the layout of the
>>>> cluster at connection time and thereby tries per request to contact the
>>>> coordinator directly, but I never read or see something like this happening.
>>>> Remembering the picture of Dean about network and hard disk latencies,
>>>> is this 3-sequential-network-call still faster?
>>>> Thanks for any thoughts :)
>>>> Peter
>>>> --
>>>> Peter Dijkshoorn
>>>> Adyen - Payments Made Easy
>>>> Visiting Address:                 Mail Address:
>>>> Stationsplein 57 - 4th floor      P.O. Box 10095
>>>> 1012 AB Amsterdam                 1001 EB Amsterdam
>>>> The Netherlands                   The Netherlands
>>>> Office +
>>>> Email

View raw message