cassandra-user mailing list archives

From aaron morton <>
Subject Re: Consistency Level
Date Wed, 04 Jan 2012 09:15:54 GMT
I've not spent much time with the secondary indexes, so a couple of questions. 

What is the output of nodetool ring?
Which node were you connected to when you did the get?
If you enable DEBUG logging, what do the StorageProxy log messages that contain the strings
"scan ranges are" and "reading .. from ..." say?
Now for the wild guessing. It's working as designed for a CL ONE request. Looking at test
case 5, and *assuming* you were connected to node 2, this is what I think is happening: a get
indexed slice query without a start key does not know which nodes will contain the data. Reading
the code, it will consider the replicas for the minimum token as the nodes to send the query to;
for a CL ONE query it will only use one. If you were connected to node 2, the query would have
executed only on node 2.

This is where I get confused. What happens if you have 50 nodes with RF 3, and you execute
a get_indexed_slices at QUORUM with no start_key, and the only rows that satisfy the query exist
on nodes 47, 48 and 49? They are a long way away from the minimum token, assuming SimpleStrategy
and a well-ordered token ring.
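
To make that scenario concrete, here is a minimal Python sketch of SimpleStrategy-style replica
placement. It only illustrates the guess above and is not Cassandra's code: the integer tokens
(node n owns token n * 100), the use of 0 as the minimum token, the row token 4650, and the helper
name replicas_for_token are all assumptions made up for the example.

```python
from bisect import bisect_left


def replicas_for_token(ring, token, rf):
    """Return the rf nodes holding `token` under SimpleStrategy-style placement.

    `ring` is a list of (token, node) pairs sorted by token. The first replica
    is the first node whose token is >= `token` (wrapping around the ring);
    the remaining replicas are the next rf - 1 nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


# Hypothetical, evenly spaced 50-node ring: node n owns token n * 100.
ring = [(n * 100, n) for n in range(1, 51)]

MIN_TOKEN = 0  # stand-in for the partitioner's minimum token (assumption)

# Nodes a query would be sent to if only the minimum token's replicas are used.
min_token_replicas = replicas_for_token(ring, MIN_TOKEN, rf=3)

# Nodes that actually hold a row whose (made-up) token is 4650.
row_replicas = replicas_for_token(ring, 4650, rf=3)

print("replicas for minimum token:", min_token_replicas)
print("replicas for the row:", row_replicas)
```

Under these assumptions the minimum token maps to nodes 1, 2 and 3, while the row lives on nodes
47, 48 and 49, so a query sent only to the minimum token's replicas would never reach the nodes
holding the data.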

I think I've missed something, anyone?
Aaron Morton
Freelance Developer

On 4/01/2012, at 9:44 AM, Kamal Bahadur wrote:

> Hi Peter,
> To test, I wiped out all the data from Cassandra and inserted just one record. The row
> key is "7a7a32323636373030303438303031". I used getendpoints to see where my data is and
> double-checked it using the sstable2json command.
> Since the RF is 2, the data is currently on Node 1 and Node 4 of my 4-node cluster.
> I used cassandra-cli to query the data using one of the secondary indexes, and the following
> are my results:
> Test   Node 1   Node 2   Node 3   Node 4   Got data back?
> 1      Up       Up       Up       Up       Yes
> 2      Up       Up       Up       Down     Yes
> 3      Up       Up       Down     Up       Yes
> 4      Down     Up       Up       Up       Yes
> 5      Up       Up       Down     Down     No
> 6      Down     Down     Up       Up       No
> 7      Up       Down     Down     Up       No
> It turns out that even though my consistency level is ONE, since I am using a secondary
> index to query the data, at least 3 nodes have to be running. And out of these 3 running
> nodes, it works even if just one of them contains the data.
> Somewhere on the mailing list I read that "Iterating through all of the rows matching an index
> clause on your cluster is guaranteed to touch N/RF of the nodes in your cluster, because each
> node only knows about data that is indexed locally."
> I am not sure what N/RF means in my case. Does it mean 4/2 = 2, where 4 is the number
> of nodes and 2 is the RF? If it is 2, why is it not returning any data when the two nodes
> that contain the data are running (test #7)?
> For my use case, I have to have an RF of 2 and should be able to query using a secondary
> index with a CL of ONE. Is this possible when 2 nodes are down in a 4-node cluster? Are there
> any limitations on using secondary indexes?
> Thanks in advance.
> Thanks,
> Kamal
> On Thu, Dec 29, 2011 at 6:40 PM, Peter Schuller <> wrote:
> > > Thanks for the response Peter! I checked everything and it looks good to me.
> > >
> > > I am stuck with this for almost 2 days now. Has anyone had this issue?
> > While it is certainly possible that you're running into a bug, it
> > seems unlikely to me since it is the kind of bug that would affect
> > almost anyone if it is failing with Unavailable due to unrelated (not
> > in replica sets) nodes being down.
> > Can you please post back with (1) the ring layout ('nodetool ring'),
> > and (2) the exact row key that you're testing with?
> > You might also want to run with DEBUG level (modify
> > at the top) and the strategy (assuming you are
> > using NetworkTopologyStrategy) will log selected endpoints, and
> > confirm that it's indeed picking endpoints that you think it should
> > based on getendpoints.
> > --
> > / Peter Schuller (@scode,
