From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Secondary index read/write explanation
Date: Fri, 7 Sep 2012 11:42:48 +1200

> 1. When a write request is received, it is written to the base CF and the
> secondary index is written to a secondary (hidden) CF. If this is right,
> will the secondary index be written locally on the node, or will it follow
> RP/OPP to write to nodes?

It's local.

If an index is to be updated, the previous column values must be read from the primary CF so they can be deleted from the secondary index CF before the new values are inserted.

> 2. When a coordinator receives a read request with, say, predicate x=y
> where column x is the secondary index, how does the coordinator query the
> relevant node(s)? How does it avoid sending it to all nodes if it is
> locally indexed?

When you ask for x=y the coordinator has no idea where the rows for that query exist in the cluster. If you ask at CL ONE it only does a local read. If you ask at a higher CL it asks CL nodes for each TokenRange in the cluster, or for a restricted token range if you have a key restriction in the query.

> If there is any article/blog that can help understand this better,
> please let me know.

I think this is still mostly relevant:
http://www.datastax.com/docs/0.7/data_model/secondary_indexes

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/09/2012, at 5:32 PM, Venkat Rama wrote:

> Hi All,
>
> I am a newbie to Cassandra and am trying to understand how secondary
> indexes work. I have been going over the discussion on
> https://issues.apache.org/jira/browse/CASSANDRA-749 about local secondary
> indexes, and an interesting question on
> http://www.mail-archive.com/user@cassandra.apache.org/msg16966.html.
> The discussion seems to assume that the most common use cases are ones
> with range queries. Is this right?
>
> I am trying to understand the low cardinality reasoning and how the read
> gets executed. I have the following questions, hoping I can explain my
> question well :)
>
> 1. When a write request is received, it is written to the base CF and the
> secondary index is written to a secondary (hidden) CF. If this is right,
> will the secondary index be written locally on the node, or will it follow
> RP/OPP to write to nodes?
> 2. When a coordinator receives a read request with, say, predicate x=y
> where column x is the secondary index, how does the coordinator query the
> relevant node(s)? How does it avoid sending it to all nodes if it is
> locally indexed?
>
> If there is any article/blog that can help understand this better,
> please let me know.
>
> Thanks again in advance.
>
> VR
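
For readers who want to see the two points from the reply in miniature, here is a rough Python sketch. It is a toy model and not Cassandra's implementation: the `Node` class, `node_for`, and `coordinator_query` are hypothetical names invented for illustration, the byte-sum placement is only a stand-in for a partitioner, and replication, consistency levels, and token ranges are deliberately left out. It shows only that each node indexes its own rows, that an update reads the previous value so the stale index entry can be removed, and that a coordinator which does not know where matches live has to ask the nodes and merge their answers.

```python
from collections import defaultdict


class Node:
    """Toy node: base rows plus a local index over one column (hypothetical)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                  # row key -> {column: value}
        self.index = defaultdict(set)   # indexed value -> row keys on THIS node

    def write(self, key, columns):
        # Read the previous value from the base data first, so the stale
        # index entry can be deleted before the new one is inserted.
        old = self.rows.get(key, {}).get(self.indexed_column)
        self.rows.setdefault(key, {}).update(columns)
        new = self.rows[key].get(self.indexed_column)
        if old != new:
            if old is not None:
                self.index[old].discard(key)
                if not self.index[old]:
                    del self.index[old]
            if new is not None:
                self.index[new].add(key)

    def local_lookup(self, value):
        # Answers only for rows stored on this node.
        return sorted(self.index.get(value, ()))


def node_for(key, nodes):
    # Stand-in for a partitioner: deterministic placement by row key.
    return nodes[sum(key.encode()) % len(nodes)]


def coordinator_query(nodes, value):
    # The coordinator cannot tell which nodes hold matching rows,
    # so it asks every node and merges the local results.
    found = []
    for node in nodes:
        found.extend(node.local_lookup(value))
    return sorted(found)


nodes = [Node("state") for _ in range(3)]
for key, state in [("alice", "TX"), ("bob", "CA"), ("carol", "TX")]:
    node_for(key, nodes).write(key, {"state": state})

# Updating alice removes her old "TX" index entry on her node.
node_for("alice", nodes).write("alice", {"state": "CA"})

print(coordinator_query(nodes, "TX"))  # ['carol']
print(coordinator_query(nodes, "CA"))  # ['alice', 'bob']
```

The index entry always lives on the same node as the row it points to, which is why the write needs no extra network hop but the read cannot be routed by the indexed value alone.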