Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of ertiop93@gmail.com designates
 209.85.160.44 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CADVHTB8h2C0AbNsbw3ujN8e4KqrG=maF-==SSKXnWPgB0xrHYw@mail.gmail.com>
References: 
 <CADwHx2qvOXD1dvEFf3XCA5m1G0pw6ueeJh+P6TkEA0jF0TVn5g@mail.gmail.com>
	<4F7175DC.5040804@gmail.com>
	<CALk=J5_USyq8pCaqoZj+Yi4Qb2-vshpH=u9KHwgtTUhwFB9neA@mail.gmail.com>
	<CADVHTB8h2C0AbNsbw3ujN8e4KqrG=maF-==SSKXnWPgB0xrHYw@mail.gmail.com>
Date: Tue, 27 Mar 2012 15:01:23 +0530
Message-ID: 
 <CADwHx2qe2p-JFKRAxbGQuM9L1tirCq_Tg8xxBoM8PJussFohAw@mail.gmail.com>
Subject: Re: Schema advice/help
From: Ertio Lew <ertiop93@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=e89a8ff256a277588004bc3627a5

--e89a8ff256a277588004bc3627a5
Content-Type: text/plain; charset=ISO-8859-1

@R. Verlangen:
You are suggesting that I keep a single row for all activities, read all
the columns from that row, and then filter, right?

If I do it that way (instead of keeping 5 rows), I would need to retrieve
100-200 columns from a single row rather than just 50 columns across 5
rows. Which of these two would be better: more columns from a single row,
or fewer columns from multiple rows?
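
To make the comparison concrete, here is a rough sketch in plain Python. It
is not Cassandra client code: the in-memory lists and dicts only stand in for
rows and columns, and the item type names, the 40 activities per type, and
all function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical item types; the real ones are not named in this thread.
ITEM_TYPES = ["photo", "post", "link", "video", "comment"]
N = 10  # activities wanted per item type


def build_activities(per_type=40):
    """Fake data: (item_type, activity_id) pairs; activity_id is a timestamp."""
    return [(t, ts) for t in ITEM_TYPES for ts in range(per_type)]


def single_row(activities):
    """Layout A: one row per user, composite column name (item_type, activity_id)."""
    return sorted(activities)  # columns stay sorted by composite name


def five_rows(activities):
    """Layout B: one row per userId:itemType, column name = activity_id."""
    rows = defaultdict(list)
    for item_type, ts in activities:
        rows[item_type].append(ts)
    for cols in rows.values():
        cols.sort()
    return rows


def read_single_row_then_filter(row, n=N):
    """Fetch every column of the single row, then keep the last n per type."""
    fetched = len(row)
    by_type = defaultdict(list)
    for item_type, ts in row:
        by_type[item_type].append(ts)
    return {t: cols[-n:] for t, cols in by_type.items()}, fetched


def read_five_rows(rows, n=N):
    """Fetch only the last n columns of each row, as a per-row slice would."""
    result = {t: cols[-n:] for t, cols in rows.items()}
    fetched = sum(len(cols) for cols in result.values())
    return result, fetched


if __name__ == "__main__":
    acts = build_activities()
    _, fetched_a = read_single_row_then_filter(single_row(acts))
    _, fetched_b = read_five_rows(five_rows(acts))
    print("single row, read all then filter:", fetched_a, "columns fetched")
    print("five rows, last", N, "from each:", fetched_b, "columns fetched")
```

Both reads return the same 50 activities. The sketch only counts how many
columns each approach pulls back, and it says nothing about the cost of
touching 5 rows instead of 1, which is the part I am unsure about.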

On Tue, Mar 27, 2012 at 2:27 PM, R. Verlangen <robin@us2.nl> wrote:

> You can just get a slice range with "userId:" as the start and no end.
>
>
> 2012/3/27 Maciej Miklas <mac.miklas@googlemail.com>
>
>> multiget would require Order Preserving Partitioner, and this can lead to
>> unbalanced ring and hot spots.
>>
>> Maybe you can use a secondary index on "itemtype" - it must have small
>> cardinality:
>> http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/
>>
>>
>>
>>
>> On Tue, Mar 27, 2012 at 10:10 AM, Guy Incognito <dnd1066@gmail.com> wrote:
>>
>>> without the ability to do disjoint column slices, i would probably use 5
>>> different rows.
>>>
>>> userId:itemType -> activityId
>>>
>>> then it's a multiget slice of 10 items from each of your 5 rows.
>>>
>>>
>>> On 26/03/2012 22:16, Ertio Lew wrote:
>>>
>>>> I need to store activities by each user, on 5 item types. I always
>>>> want to read the last 10 activities on each item type, by a user (i.e.,
>>>> total activities to read at a time = 50).
>>>>
>>>> I want to store these activities in a single row for each user so
>>>> that they can be retrieved in a single row query, since I want to read
>>>> the last 10 activities on each item. I am thinking of creating composite
>>>> names appending "itemtype" : "activityId" (activityId is just a timestamp
>>>> value), but then I don't see how to read the last 10 activities from
>>>> all itemtypes.
>>>>
>>>> Any ideas about a schema to do this in a better way?
>>>>
>>>
>>>
>>
>
>
> --
> With kind regards,
>
> Robin Verlangen
> www.robinverlangen.nl
>
>

--e89a8ff256a277588004bc3627a5--