From: Erik Holstad <erikholstad@gmail.com>
To: cassandra-user@incubator.apache.org
Date: Tue, 2 Feb 2010 15:02:39 -0800
Subject: Re: Using column plus value or only column?
Message-ID: <74f4d40b1002021502q3b8034dfu97aef578d39fbc74@mail.gmail.com>

@Nathan
So what I'm planning to do is store multiple sort orders for the same data, where they all use the
same data table but fetch it in different orders, so to speak. I want to be able to read the different sort
orders from the front and from the back to get both regular and reverse sort order.
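
A minimal sketch of that layout, using a plain in-memory model rather than a real Cassandra client. Here `data` stands in for the data table, and each sort order is a sorted list of composite column names of the form sortKey_datarow with no value. The helper names `build_index` and `read_index`, the separator, and the sample rows are all made up for illustration.

```python
from bisect import insort
from itertools import islice

# Toy stand-in for the "Data" column family: row key -> columns.
data = {
    "row1": {"name": "carol", "age": "31"},
    "row2": {"name": "alice", "age": "45"},
    "row3": {"name": "bob", "age": "27"},
}

SEP = "_"  # hypothetical separator between sort key and data row key


def build_index(table, column):
    """Return sorted composite names 'sortKey_datarow' for one sort order."""
    names = []
    for row_key, columns in table.items():
        insort(names, columns[column] + SEP + row_key)
    return names


def read_index(names, count, reverse=False):
    """Read up to `count` data row keys from the front or back of the index."""
    ordered = reversed(names) if reverse else iter(names)
    picked = list(islice(ordered, count))
    # The data row key is everything after the last separator.
    return [name.rsplit(SEP, 1)[1] for name in picked]


by_name = build_index(data, "name")
by_age = build_index(data, "age")

print(read_index(by_name, 2))                # ['row2', 'row3']
print(read_index(by_name, 2, reverse=True))  # ['row1', 'row3']
print(read_index(by_age, 1))                 # ['row3']
```

Both indexes point at the same `data` rows, so nothing is copied except the sort key. In this sketch the names compare as strings, so numeric sort keys would need zero padding, and a row key containing the separator would need a different encoding.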

With your approach using super columns you would need to replicate all data, right?

And if I understand http://issues.apache.org/jira/browse/CASSANDRA-598 correctly, you would need to
read the whole thing before you can limit the results handed back to you.
As for the two calls get_slice and get_range_slice, the way I understand it is that you hand
the second one an optional start and stop key plus a limit to get a range of keys/rows. I was planning
to use this call together with the OPP, but am thinking about not using it since there is no way to do
an inverse scan, right?
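
To make my reading of the second call concrete, here is a toy model of it, not the Thrift API. `range_rows` is a hypothetical helper over row keys kept in sorted order (as an order-preserving partitioner would keep them). The inclusive bounds and the empty string meaning "no bound" are assumptions of the sketch.

```python
from bisect import bisect_left, bisect_right

# Row keys in sorted order, standing in for an order-preserving partitioner.
row_keys = sorted(["apple", "banana", "cherry", "date", "fig"])


def range_rows(keys, start="", stop="", limit=100):
    """Toy forward-only scan: up to `limit` keys in [start, stop], ascending.

    Empty strings mean "no bound". Inclusive bounds are an assumption of
    this sketch, not a statement about the real call.
    """
    lo = bisect_left(keys, start) if start else 0
    hi = bisect_right(keys, stop) if stop else len(keys)
    return keys[lo:hi][:limit]


# Forward scan with a limit: cheap, stops after two rows.
first_two = range_rows(row_keys, "banana", "date", limit=2)

# "Last two rows first": with no reverse flag in this model, the caller has
# to scan the whole range and reverse it on the client side.
last_two = range_rows(row_keys)[-2:][::-1]

print(first_two)  # ['banana', 'cherry']
print(last_two)   # ['fig', 'date']
```

In this model the limit only helps when reading from the front, which is the reason I am leaning towards keeping each sort order as column names inside a single row instead.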

Thanks a lot
Erik


On Tue, Feb 2, 2010 at 2:39 PM, Jesse McConnell <jesse.mcconnell@gmail.com> wrote:
> infinite is a bit of a bold claim....
>
> by my understanding you are bound by the memory of the jvm as all of
> the content of a key/row currently needs to fit in memory for
> compaction, which includes columns and supercolumns for a given key/row.
>
> if you are going to run into those scenarios then some sort of
> sharding on the keys is required, afaict
>
> cheers,
> jesse
>
> --
> jesse mcconnell
> jesse.mcconnell@gmail.com
>
> On Tue, Feb 2, 2010 at 16:30, Nathan McCall <nate@vervewireless.com> wrote:
> > Erik,
> > Sure, you could and depending on the workload, that might be quite
> > efficient for small pieces of data. However, this also sounds like
> > something that might be better addressed with the addition of a
> > SuperColumn on "Sorts" and getting rid of "Data" altogether:
> >
> > Sorts : {
> >   sort_row_1 : {
> >     sortKey1 : { col1:val1, col2:val2 },
> >     sortKey2 : { col1:val3, col2:val4 }
> >   }
> > }
> >
> > You can have an infinite number of SuperColumns for a key, but make
> > sure you understand get_slice vs. get_range_slice before you commit to
> > a design. Hopefully I understood your example correctly, if not, do
> > you have anything more concrete?
> >
> > Cheers,
> > -Nate
> >
> > On Tue, Feb 2, 2010 at 12:00 PM, Erik Holstad <erikholstad@gmail.com> wrote:
> >> Thanks Nate for the example.
> >>
> >> I was thinking more along the lines of something like:
> >>
> >> If you have a family
> >>
> >> Data : {
> >>   row1 : {
> >>     col1:val1
> >>   },
> >>   row2 : {
> >>     col1:val2,
> >>     ...
> >>   }
> >> }
> >>
> >> Using
> >> Sorts : {
> >>   sort_row : {
> >>     sortKey1_datarow1: [],
> >>     sortKey2_datarow2: []
> >>   }
> >> }
> >>
> >> Instead of
> >> Sorts : {
> >>   sort_row : {
> >>     sortKey1: datarow1,
> >>     sortKey2: datarow2
> >>   }
> >> }
> >>
> >> If that makes any sense?
> >>
> >> --
> >> Regards Erik



--
Regards Erik