Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <2ac3cecf-838e-c27e-c844-a7820bdbe492@me.com>
References: <AANLkTikPcJB+8TypE6S3A+cHM6WLuReMfMesrgJo=cOD@mail.gmail.com>
	<2ac3cecf-838e-c27e-c844-a7820bdbe492@me.com>
Date: Thu, 2 Dec 2010 21:58:15 -0800
Message-ID: <AANLkTikb2uRjzGfFZMbOEyCxHrnMX+JVPD3Kq9aDhEs6@mail.gmail.com>
Subject: Re: Fetch a SuperColumn based on value of column
From: Narendra Sharma <narendra.sharma@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=20cf3054a64167607904967b3944

--20cf3054a64167607904967b3944
Content-Type: text/plain; charset=ISO-8859-1

Thanks Aaron!

The first request requires you to know the SuperColumn name. In my case I
don't know the SuperColumn name; if I did, I could simply read the super
column. I need to find the SuperColumn that contains a given column with a
given value.
The use case is that the application allows querying an object by two
attributes. I have made one of the attributes the SuperColumn name, and I
need to keep the second attribute as a subcolumn in the super column. Now I
need to search by that subcolumn.
I think the only option is to maintain another CF whose column names are the
values of the second attribute and whose column values are the names of the
super columns in the current CF. Is there any better way to handle this?
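
To make the proposed layout concrete, here is a minimal sketch in Python. It
uses plain dicts as stand-ins for the two column families, not a Cassandra
client; the names objects_cf, index_cf, write_object and
find_by_second_attribute are hypothetical, and it assumes the second
attribute is unique within a row.

```python
# In-memory stand-ins for the two column families (plain dicts, no Cassandra client).
# Main super CF:  row key -> super column name -> subcolumn name -> value
objects_cf = {}
# Index CF:       row key -> second attribute value -> super column name
index_cf = {}


def write_object(row_key, super_column_name, subcolumns, indexed_column):
    """Store a super column and record its indexed subcolumn value in the index CF."""
    row = objects_cf.setdefault(row_key, {})
    index_row = index_cf.setdefault(row_key, {})

    # If the super column already exists, drop the index entry for its old value.
    old = row.get(super_column_name)
    if old is not None and indexed_column in old:
        old_value = old[indexed_column]
        if index_row.get(old_value) == super_column_name:
            del index_row[old_value]

    row[super_column_name] = dict(subcolumns)
    if indexed_column in subcolumns:
        index_row[subcolumns[indexed_column]] = super_column_name


def find_by_second_attribute(row_key, value):
    """Look up the super column name in the index CF, then read the super column."""
    super_column_name = index_cf.get(row_key, {}).get(value)
    if super_column_name is None:
        return None
    return super_column_name, objects_cf[row_key][super_column_name]
```

A lookup by the second attribute then costs one read of the index row plus
one read of the named super column. In this sketch the index update and the
data write are two separate steps, so a real application would have to decide
what to do if one succeeds and the other fails.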

Thanks,
Naren

On Thu, Dec 2, 2010 at 5:48 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> You can use column and super column names with the get_slice() function
> without 0.7 secondary indexes. I'm assuming that the original query was to
> test for the existence of a column by name.
>
> In the case below, retrieving the full super column would require two
> requests.
>
> The first tests the condition: call get_slice with a ColumnParent that
> specifies the CF and the super column, and a slice predicate whose
> column_names[] contains the name of the column you want. This query would
> return only that one column.
>
> If you then wanted all the columns in the super column, you would make a
> second request.
>
> If making two requests is a pain or too slow, consider changing the data
> model to better support the requests you need to make.
>
> AFAIK a lot of super columns will not impact performance any more than a
> lot of columns. There are, however, limitations on the number of columns in
> a super column: http://wiki.apache.org/cassandra/CassandraLimitations
> Hope that helps.
> Aaron
>
>
> On 03 Dec, 2010, at 01:10 PM, Nick Santini <nick.santini@kaseya.com> wrote:
>
> actually, the solution would be something like my last mail, but pointing
> to the name of the super column and the row key
>
>
> Nicolas Santini
> Director of Cloud Computing
> Auckland - New Zealand
> (64) 09 914 9426 ext 2629
> (64) 021 201 3672
>
>
>
> On Fri, Dec 3, 2010 at 1:08 PM, Nick Santini <nick.santini@kaseya.com> wrote:
>
>> Hi,
>> as I was told in reply to my own mail, secondary indexes on super column
>> families are not supported yet, so you have to implement your own.
>>
>> Easy way: keep another column family where the row key is the value of
>> your field and the columns are the row keys of your super column family
>> (an inverted index).
>>
>>
>> Nicolas Santini
>> Director of Cloud Computing
>> Auckland - New Zealand
>> (64) 09 914 9426 ext 2629
>> (64) 021 201 3672
>>
>>
>>
>>
>> On Fri, Dec 3, 2010 at 1:00 PM, Narendra Sharma <
>> narendra.sharma@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> My schema has a row that has thousands of Super Columns. The size of each
>>> super column is around 500 B (20 columns). I need to query one SuperColumn
>>> based on the value of one of its columns. Something like
>>>
>>> SELECT SuperColumn FROM Row WHERE SuperColumn.column="value"
>>>
>>> Questions:
>>> 1. Is this possible with current Cassandra APIs? If yes, could you please
>>> show a sample?
>>> 2. How would such a query perform if the number of SuperColumns is high
>>> (> 10K)?
>>>
>>> Cassandra version 0.7.
>>>
>>> Thanks,
>>> Naren
>>>
>>>
>>
>
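
For reference, the two-request pattern from Aaron's reply can be modeled in
memory as below. This is only a sketch: toy_get_slice is a hypothetical
stand-in over nested dicts and does not have the signature of the real Thrift
get_slice call. The scan_for_value helper (also hypothetical) shows what is
left when the super column name is unknown and no index CF exists: read the
whole row and filter on the client.

```python
def toy_get_slice(store, key, column_family, super_column=None, column_names=None):
    """Toy slice read over nested dicts; NOT the real Thrift get_slice signature.

    store layout: column family -> row key -> super column -> subcolumn -> value
    """
    row = store.get(column_family, {}).get(key, {})
    columns = row if super_column is None else row.get(super_column, {})
    if column_names is None:
        return dict(columns)
    return {name: columns[name] for name in column_names if name in columns}


def fetch_super_column_if_column_exists(store, key, column_family, super_column, column_name):
    # Request 1: ask only for the one named column to test the condition.
    probe = toy_get_slice(store, key, column_family, super_column, [column_name])
    if not probe:
        return None
    # Request 2: read every column in the super column.
    return toy_get_slice(store, key, column_family, super_column)


def scan_for_value(store, key, column_family, column_name, value):
    # Without the super column name: read all super columns, filter client side.
    row = toy_get_slice(store, key, column_family)
    return [name for name, cols in row.items() if cols.get(column_name) == value]
```

With the numbers from the original question (more than 10K super columns of
around 500 B each), a client-side scan like scan_for_value would pull roughly
5 MB for every lookup, which is the cost the inverted index CF avoids.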

--20cf3054a64167607904967b3944--