Subject: Re: KeyRange.token in 0.7.0
From: "B. Todd Burruss"
To: "user@cassandra.apache.org"
Date: Tue, 24 Aug 2010 11:58:03 -0700

kew .. hadoop is on my list, has been on my list, will probably still be
on the list tomorrow ;)

On Tue, 2010-08-24 at 11:53 -0700, Jonathan Ellis wrote:
> in other words you're reinventing hadoop. not really recommended, but
> knock yourself out if that's what you want to do. :)
>
> On Tue, Aug 24, 2010 at 1:28 PM, B. Todd Burruss wrote:
> > i just came across this and i use tokens in range queries because it is
> > an easy straightforward way to divide the keyspace and operate on it
> > using multiple threads and throttle the processing. maybe this is what
> > hadoop does, i don't know much about hadoop.
> >
> > so i don't really agree that i'm doing it wrong. why is this?
> >
> > On Wed, 2010-08-18 at 11:18 -0700, Ran Tavory wrote:
> >>
> >> On Wed, Aug 18, 2010 at 4:30 PM, Jonathan Ellis wrote:
> >> (a) if you're using token queries and you're not hadoop,
> >> you're doing it wrong
> >>
> >> ah, didn't know that, so I guess I'll remove support for it from
> >> hector...
> >>
> >> (b) they are expected to be of the form generated by
> >> TokenFactory.toString and fromString. You should not be
> >> generating them yourself.
> >>
> >> On Wed, Aug 18, 2010 at 7:56 AM, Ran Tavory wrote:
> >> > I'm a bit confused WRT KeyRange's tokens in 0.7.0
> >> > When making a range query you can either use KeyRange.key or
> >> > KeyRange.token.
> >> > In 0.7.0 key was typed as byte[]. tokens remain strings.
> >> > What does this string represent in case of a RP and in case of an OPP?
> >> > Did this change in 0.7.0?
> >> > AFAIK in 0.6.0 if the partitioner is OPP then the tokens are actual
> >> > strings and they might just be actual subset of the keys. When using
> >> > a RP tokens are BigIntegers (keys are still strings) and I'm not
> >> > actually sure if you're allowed to shoot a range query using tokens...
> >> > In 0.7.0 since keys are now bytes, when using an OPP, how do those
> >> > bytes translate to strings? I'd assume it'd just be byte[] -> UTF8
> >> > conversion, only that this may result in illegal UTF8 chars when keys
> >> > are just random bytes, so I guess not... Perhaps md5 hashing? But then
> >> > if using an OPP and keys are actual strings, I want to have the same
> >> > 0.6.0 functionality in place, meaning tokens are strings like the
> >> > keys. I actually tested this scenario and it looks working, so it
> >> > seems like the String keys are translated to UTF8, but what happens
> >> > when they are invalid UTF8?
> >> > Another question is what's the story with RP in 0.7.0? Should range
> >> > query even be supported with tokens? If so, then are the tokens
> >> > expected to be string of integers? (e.g. "1234567890")
> >> > Thanks.
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra
> >> support
> >> http://riptano.com
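
The approach described in this thread (dividing the keyspace into token ranges and handing each range to a worker thread) comes down to simple range arithmetic. Below is a minimal sketch in Python, assuming a RandomPartitioner whose tokens are integers in [0, 2**127] written as decimal strings. `RP_MAX_TOKEN` and `split_token_space` are hypothetical names for illustration, not Cassandra or Hector API. The thread's advice still applies: token strings are expected to come from TokenFactory.toString, so this only illustrates how the ranges would be laid out, not a recommended way to produce tokens.

```python
# Assumption: RandomPartitioner tokens are integers in [0, 2**127].
RP_MAX_TOKEN = 2 ** 127


def split_token_space(parts, max_token=RP_MAX_TOKEN):
    """Return `parts` contiguous (start, end) token pairs as decimal strings.

    Hypothetical helper: each pair could serve as the start/end token of one
    range query, with one worker thread assigned per pair.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Integer arithmetic keeps the boundaries exact for 127-bit values.
    bounds = [max_token * i // parts for i in range(parts + 1)]
    return [(str(bounds[i]), str(bounds[i + 1])) for i in range(parts)]


if __name__ == "__main__":
    for start, end in split_token_space(4):
        print(start, end)
```

Each slice ends exactly where the next one begins, so together the slices cover the whole token space without gaps. Throttling would then be a matter of how many slices are processed concurrently.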