From: Christian Decker <decker.christian@gmail.com>
Date: Thu, 30 Sep 2010 09:10:27 +0200
Subject: Re: LongType from user input
To: user@cassandra.apache.org
In-Reply-To: <41a3af72-1cd7-e9c9-cc51-ffa0e7095435@me.com>

Apparently I had blanked 0.7 completely out of my memory. I was trying to
implement application-layer indices and ignored the fact that Cassandra 0.7
is implementing them by default. I found ticket CASSANDRA-749 about the
indices and am reading through the code right now, but is there a
higher-level overview and a tutorial on how to get started with these
indices (and maybe some of the inner workings)? This might actually solve
all of the problems I'm having right now :-)

Regards,
Chris

On Mon, Sep 27, 2010 at 3:45 AM, Aaron Morton <aaron@thelastpickle.com> wrote:

> The only thing I can think of is that values need to be in the correct byte
> format when used in indexes in 0.7. Take a look at the types.py module in
> the pycassa client http://github.com/pycassa/pycassa for an example of
> which values need to be byte packed.
>
> How is your Pig function working against Cassandra? Is it using the
> ColumnFamilyRecordReader? The code in the internal RowIterator for that
> class has an example of calling the cluster to get the comparators.
>
> Aaron
>
>
> On 27 Sep 2010, at 03:11 AM, Christian Decker <decker.christian@gmail.com>
> wrote:
>
> Hi Aaron,
>
> what changes can I expect in the 0.7 release regarding comparison and
> parameters? My problem is mainly that I want to take strings from stdin (or
> Pig scripts, for that matter) and convert them to the corresponding byte
> representation, so that they are interpreted correctly when used as column
> names and keys.
>
> Regards,
> Chris
>
> On Sun, Sep 26, 2010 at 5:20 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
>> Things are changing in v0.7: the row keys are byte arrays.
>>
>> Not sure I understand your other concerns.
>>
>> Aaron
>>
>>
>> On 25 Sep 2010, at 08:10, Christian Decker <decker.christian@gmail.com>
>> wrote:
>>
>>
>> Thanks for your quick answer. I think I'll use an affix to sort of cast
>> the keys, ranges and others from their textual representation (from Pig) to
>> the desired byte representation. I just noticed that the keys for the
>> rows themselves are always interpreted as UTF8, and since I want to make
>> key-range as well as slice queries, I think I'll be better off this way.
>> I'll just add an 'L' for Long and a 'U' for UUID (of any kind).
>> Or is there a better way that I just can't see from my beginner's angle?
>> :-)
>>
>> Regards,
>> Chris
>>
>>
>> On Fri, Sep 24, 2010 at 6:35 PM, Tyler Hobbs <tyler@riptano.com> wrote:
>>
>>> Yes, you can use describe_keyspace() and then look through the results.
>>> It's a little ugly in 0.6, but it works.
>>>
>>> - Tyler
>>>
>>>
>>>
>>> On Fri, Sep 24, 2010 at 11:25 AM, Christian Decker <
>>> decker.christian@gmail.com> wrote:
>>>
>>>> Well, I'm writing a loading function for Pig, and as it happens I want to
>>>> be able to load slices from Cassandra which are specified in the Pig script
>>>> (thus the input from stdin). But the ColumnFamily from which to read the
>>>> data is another parameter, and some of the CFs have UTF8, UUID, TimeUUID
>>>> or Long types for their keys and columns, so simply converting everything
>>>> I get to an 8-byte long would break compatibility with the others.
>>>> Now that I think about it, I attacked the whole problem in a weird way,
>>>> since UUID types won't work either.
>>>> So let me change my question slightly: is there a way in 0.6 to detect
>>>> the compareWith type on a running cluster? That way I could convert it to
>>>> the right type :D
>>>>
>>>> Regards,
>>>> Chris
>>>>
>>>>
>>>> On Fri, Sep 24, 2010 at 6:09 PM, Tyler Hobbs <tyler@riptano.com> wrote:
>>>>
>>>>> I'm not sure I understand why using this with multiple column families
>>>>> prevents you from converting it. Could you clarify this?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 24, 2010 at 10:56 AM, Christian Decker <
>>>>> decker.christian@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm having quite a dilemma with the CompareWith attribute. The problem
>>>>>> is that I have numeric IDs that I'd like to use as row keys, but I also
>>>>>> have to let users enter them on standard input.
>>>>>> Since I cannot ask my users to input an 8-byte sequence representing the
>>>>>> ID they'd like, I was about to turn to UTF8, when I remembered that UTF8
>>>>>> values are compared lexicographically, so that 100 actually comes before
>>>>>> 2, which kills key slices. Also, I cannot just code a converter in, since
>>>>>> this is supposed to be used with multiple column families, so just
>>>>>> converting an integer read into 8 bytes isn't going to work either.
>>>>>> Any tricks for this one?
>>>>>>
>>>>>> Regards,
>>>>>> Chris

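For readers following the thread, the ordering problem and the affix idea discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from pycassa or from the Pig loader: the helper names (`pack_long`, `pack_uuid`, `encode_key`) and the `L:` / `U:` prefix syntax are hypothetical, and the sketch assumes that a LongType value is an 8-byte big-endian integer and that a UUID value is its 16 raw bytes.

```python
import struct
import uuid


def pack_long(text):
    """Pack a decimal string into 8 big-endian bytes.

    Assumption: this is the byte layout a Long comparator expects.
    """
    return struct.pack(">q", int(text))


def pack_uuid(text):
    """Convert a textual UUID into its 16 raw bytes."""
    return uuid.UUID(text).bytes


def encode_key(token):
    """Dispatch on a hypothetical affix: 'L:' for Long, 'U:' for UUID.

    Anything without a recognised prefix is treated as UTF-8 text.
    """
    if token.startswith("L:"):
        return pack_long(token[2:])
    if token.startswith("U:"):
        return pack_uuid(token[2:])
    return token.encode("utf-8")


ids = ["100", "2"]

# Compared as text, "100" sorts before "2", which breaks key slices.
as_text = sorted(ids)

# Compared as packed 8-byte values, the order is numeric
# (shown here for non-negative IDs only).
as_longs = sorted(ids, key=pack_long)

print(as_text)
print(as_longs)
print(encode_key("L:42"))
```

A call such as `encode_key("U:12345678-1234-5678-1234-567812345678")` would return the 16-byte form. In a real loader the prefix could be dropped entirely if the comparator type is looked up from the cluster first, as suggested earlier in the thread with `describe_keyspace()`.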