From: Ying Tang <ivytang0812@gmail.com>
Date: Fri, 3 Sep 2010 14:49:35 +0800
Subject: Re: the process of reading and writing
To: user@cassandra.apache.org
In-Reply-To: <9397eb95-c0c0-efd2-df26-cc6423a371a0@me.com>

Hi Aaron,

Thanks for your reply.

In your text, does "the coordinator" mean the random client that the user sends the request to?
Do you mean that no matter what W is set to, the data will be copied to N nodes, and the client simply considers the write successful once W nodes have been written?

P.S. By "key coordinator" I don't mean a single node that is responsible for all nodes' key ranges. The key coordinator is the primary node that is responsible for a key range: if a key is in its range, this node is that key's coordinator.

On Fri, Sep 3, 2010 at 2:36 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> AFAIK,
> For a read, the coordinator sends the request to the number of nodes specified
> in the RF. RR is kicked off on the coordinator node after the read has
> completed. There is no key coordinator; what would you do if it was down?
> The first node in the list of replication nodes is considered special, but
> not that special. (In a normal read only the first node is asked for the
> data; the other nodes are asked for a digest.)
>
> Writes are the same as reads: one hop from the coordinator node to the nodes
> that will do the write.
> The one hop part is discussed in the paper.
>
> N is the number of copies of the data that will be stored. W is the
> consistency level the client is happy to accept to say that the write has
> succeeded; after W nodes have ack'd to the coordinator, it will ack to the
> client. But it's more complicated than that; search the archives for a big
> discussion on Hinted Handoff.
>
> If your client always operates such that R+W>N, you have consistency. If you
> drop R down to 1 you may read data that is not consistent with the other
> nodes in the ring, because the coordinator returns to you as soon as the
> first node does. It will then look at the results from the other nodes and
> kick off Read Repair if needed. But this is after your read request has
> completed.
>
> Aaron
>
> On 03 Sep, 2010, at 03:19 PM, Ying Tang <ivytang0812@gmail.com> wrote:
>
> Recently I read the paper about Cassandra again,
> and now I have some ideas about reading and writing.
>
> We all know Cassandra uses NWR.
>
> When reading:
> the request ---> a random node in Cassandra. This node acts as a proxy and
> routes the request.
> Here,
> 1. Does the proxy node route the request to the key's coordinator, which
>    then routes it to the other N-1 nodes, OR does the proxy route the
>    read request to all N nodes?
> 2. If it is the former, does read repair occur on the key's coordinator?
>    If it is the latter, does read repair occur on the proxy node?
>
> When writing:
> the request ---> a random node in Cassandra. This node acts as a proxy and
> routes the request.
> Here,
> 3. Does the proxy node route the request to the key's coordinator, which
>    then routes it to the other N-1 nodes, OR does the proxy route the
>    request to all N nodes?
>
> 4. N isn't the number of copies of the data; it's just a range. Within this
>    N range there must be W copies, so W is the number of copies.
>    So within this N range, R+W>N can guarantee the data's validity. Right?
>
> --
> Best regards,
>
> Ivy Tang

--
Best regards,

Ivy Tang
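
The read and write path described in this thread can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Cassandra's implementation: the names `Replica`, `Coordinator`, `write` and `read` are hypothetical, timestamps are supplied by the caller, the digest is a SHA-256 of the stored version chosen purely for illustration, and hinted handoff is left out entirely. It tries to show three points from Aaron's reply: a write is sent to all N replicas and is acknowledged once W of them ack, a read is answered from the first R replies, and read repair runs on the coordinator after the answer has been chosen.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One of the N nodes that hold a copy of a key (toy model)."""
    name: str
    up: bool = True
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def put(self, key, ts, value):
        # Keep only the newest version, decided by timestamp.
        if key not in self.store or self.store[key][0] < ts:
            self.store[key] = (ts, value)

    def get(self, key):
        return self.store.get(key, (0, None))

    def digest(self, key):
        # A digest stands in for the data; here it is a hash of the stored version.
        return hashlib.sha256(repr(self.get(key)).encode()).hexdigest()


class Coordinator:
    """The node that received the client request; it contacts the replicas directly."""

    def __init__(self, replicas):
        self.replicas = replicas  # the N replicas for one key

    def write(self, key, ts, value, w):
        # The write is sent to every replica (one hop each); success needs W acks.
        acks = 0
        for rep in self.replicas:
            if rep.up:
                rep.put(key, ts, value)
                acks += 1
        return acks >= w

    def read(self, key, r):
        live = [rep for rep in self.replicas if rep.up]
        if len(live) < r:
            raise RuntimeError("fewer than R replicas are up")
        # The client is answered from the first R replies only.
        answer = max((rep.get(key) for rep in live[:r]), key=lambda v: v[0])
        # Read repair: compare digests from all live replicas, fix stale ones.
        if len({rep.digest(key) for rep in live}) > 1:
            newest = max((rep.get(key) for rep in live), key=lambda v: v[0])
            for rep in live:
                rep.put(key, *newest)
        return answer[1]


# N = 3; replica "a" is down while both writes happen, then comes back.
a, b, c = Replica("a", up=False), Replica("b"), Replica("c")
coord = Coordinator([a, b, c])
wrote = coord.write("k1", 1, "v1", w=2) and coord.write("k2", 2, "v2", w=2)
a.up = True

stale = coord.read("k1", r=1)     # R+W = 3, not > N: answered by "a", which missed the write
repaired = coord.read("k1", r=1)  # read repair has since brought "a" up to date
quorum = coord.read("k2", r=2)    # R+W = 4 > N: the newest version is among the replies
print(wrote, stale, repaired, quorum)
```

With N=3 and W=2, the R=1 read can land on the one replica that missed the write, whereas R=2 satisfies R+W>N and so always overlaps a replica that has the latest version.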