From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Consistency Level
Date: Wed, 4 Jan 2012 22:15:54 +1300

I've not spent much time with the secondary indexes, so a couple of questions.

What is the output of nodetool ring?
Which node were you connected to when you did the get?
If you enable DEBUG logging, what do the log messages from StorageProxy that contain the strings "scan ranges are" and "reading .. from ..." say?

Now for the wild guessing... It's working as designed for a CL ONE request. Looking at test case 5, and *assuming* you were connected to node 2, this is what I think is happening: a get indexed slice query without a start key does not know which nodes will contain the data. Reading the code, it will consider the replicas for the minimum token as the nodes to send the query to; for a CL ONE query it will only use one. If you were connected to node 2, the query would have executed only on node 2.

This is where I get confused. What happens if you have 50 nodes, with RF 3, and you execute a get_indexed_slice at QUORUM with no start_key, and the only rows that satisfy the query exist on nodes 47, 48 and 49? They are a long way away from the minimum token, assuming SimpleStrategy and a well-ordered token ring.

I think I've missed something, anyone?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/01/2012, at 9:44 AM, Kamal Bahadur wrote:

> Hi Peter,
>
> To test, I wiped out all the data from Cassandra and inserted just one record. The row key is "7a7a32323636373030303438303031". I used getendpoints to see where my data is and double-checked the same using the sstable2json command.
>
> Since the RF is 2, the data is currently on Node 1 and Node 4 of my 4-node cluster. I used cassandra-cli to query the data using one of the secondary indexes, and the following are my results:
>
> Test  Node 1  Node 2  Node 3  Node 4  Got data back?
> 1     Up      Up      Up      Up      Yes
> 2     Up      Up      Up      Down    Yes
> 3     Up      Up      Down    Up      Yes
> 4     Down    Up      Up      Up      Yes
> 5     Up      Up      Down    Down    No
> 6     Down    Down    Up      Up      No
> 7     Up      Down    Down    Up      No
>
> It turns out that even though my consistency level is ONE, since I am using a secondary index to query the data, at least 3 nodes have to be running. And out of these 3 running nodes, it works even if only one of them contains the data.
>
> Somewhere on the mailing list I read that "Iterating through all of the rows matching an index clause on your cluster is guaranteed to touch N/RF of the nodes in your cluster, because each node only knows about data that is indexed locally."
>
> I am not sure what N/RF means in my case. Does it mean 4/2 = 2, where 4 is the number of nodes and 2 is the RF? If it is 2, why is it not returning any data when the two nodes that contain the data are running (test #7)?
>
> For my use case, I have to have an RF of 2 and should be able to query using a secondary index with a CL of ONE. Is this possible when 2 nodes are down in a 4-node cluster? Are there any limitations on using secondary indexes?
>
> Thanks in advance.
>
> Thanks,
> Kamal
>
> On Thu, Dec 29, 2011 at 6:40 PM, Peter Schuller wrote:
> > Thanks for the response Peter! I checked everything and it looks good to me.
> >
> > I am stuck with this for almost 2 days now. Has anyone had this issue?
>
> While it is certainly possible that you're running into a bug, it
> seems unlikely to me since it is the kind of bug that would affect
> almost anyone if it is failing with Unavailable due to unrelated (not
> in replica sets) nodes being down.
>
> Can you please post back with (1) the ring layout ('nodetool ring'),
> and (2) the exact row key that you're testing with?
>
> You might also want to run with DEBUG level (modify
> log4j-server.properties at the top) and the strategy (assuming you are
> using NetworkTopologyStrategy) will log selected endpoints, and
> confirm that it's indeed picking endpoints that you think it should
> based on getendpoints.
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

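One way to reason about the N/RF question in the quoted message is to model the ring directly. The following is only a rough sketch of one hypothesis, not something confirmed anywhere in this thread: it assumes SimpleStrategy, that Node 1 to Node 4 sit in that order on the ring, and that an index scan with no start key needs at least one live replica (for CL ONE) for every token range. The function names are made up for illustration and are not Cassandra APIs.

```python
def replicas_for_range(owner, num_nodes, rf):
    """Nodes holding the range owned by `owner` (1-based).

    Assumption: SimpleStrategy-style placement, i.e. the owner plus the
    next rf - 1 nodes clockwise on the ring.
    """
    return [((owner - 1 + i) % num_nodes) + 1 for i in range(rf)]


def scan_is_available(live_nodes, num_nodes, rf, required=1):
    """True if every token range has at least `required` live replicas.

    Assumption: a full index scan must be able to read every range,
    which is a hypothesis about the behaviour, not a confirmed fact.
    """
    for owner in range(1, num_nodes + 1):
        live = [n for n in replicas_for_range(owner, num_nodes, rf)
                if n in live_nodes]
        if len(live) < required:
            return False
    return True


# Live nodes for each test in the quoted table (4 nodes, RF 2).
tests = {
    1: {1, 2, 3, 4},
    2: {1, 2, 3},
    3: {1, 2, 4},
    4: {2, 3, 4},
    5: {1, 2},
    6: {3, 4},
    7: {1, 4},
}

results = {test: scan_is_available(live, num_nodes=4, rf=2)
           for test, live in tests.items()}

for test, ok in sorted(results.items()):
    print(test, "Yes" if ok else "No")
```

Under those assumptions the sketch gives Yes for tests 1 to 4 and No for tests 5 to 7, matching the "Got data back?" column. It would also suggest that N/RF = 2 live nodes are enough only when they are not neighbours on the ring (nodes 1 and 3, say), which is a case the tests above do not cover.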