incubator-cassandra-user mailing list archives

From "B. Todd Burruss" <bburr...@real.com>
Subject Re: KeyRange.token in 0.7.0
Date Tue, 24 Aug 2010 18:58:03 GMT
kew .. hadoop is on my list, has been on my list, will probably still be
on the list tomorrow ;)

On Tue, 2010-08-24 at 11:53 -0700, Jonathan Ellis wrote:
> in other words you're reinventing hadoop.  not really recommended, but
> knock yourself out if that's what you want to do. :)
> 
> On Tue, Aug 24, 2010 at 1:28 PM, B. Todd Burruss <bburruss@real.com> wrote:
> > i just came across this and i use tokens in range queries because it is
> > an easy straightforward way to divide the keyspace and operate on it
> > using multiple threads and throttle the processing.  maybe this is what
> > hadoop does, i don't know much about hadoop.
> >
> > so i don't really agree that i'm doing it wrong.  why is this?
> >
> >
> > On Wed, 2010-08-18 at 11:18 -0700, Ran Tavory wrote:
> >>
> >>
> >> On Wed, Aug 18, 2010 at 4:30 PM, Jonathan Ellis <jbellis@gmail.com>
> >> wrote:
> >>         (a) if you're using token queries and you're not hadoop,
> >>         you're doing it wrong
> >> ah, didn't know that, so I guess I'll remove support for it from
> >> hector...
> >>
> >>         (b) they are expected to be of the form generated by
> >>         TokenFactory.toString and fromString. You should not be
> >>         generating
> >>         them yourself.
> >>
> >>
> >>         On Wed, Aug 18, 2010 at 7:56 AM, Ran Tavory <rantav@gmail.com>
> >>         wrote:
> >>         > I'm a bit confused WRT KeyRange's tokens in 0.7.0
> >>         > When making a range query you can either use KeyRange.key or
> >>         KeyRange.token.
> >>         > In 0.7.0 key was typed as byte[]. tokens remain strings.
> >>         > What does this string represent in case of a RP and in case
> >>         of an OPP? Did
> >>         > this change in 0.7.0?
> >>         > AFAIK in 0.6.0 if the partitioner is OPP then the tokens are
> >>         actual strings
> >>         > and they might just be actual subset of the keys. When using
> >>         a RP tokens are
> >>         > BigIntegers (keys are still strings) and I'm not actually
> >>         sure if you're
> >>         > allowed to shoot a range query using tokens...
> >>         > In 0.7.0 since keys are now bytes, when using an OPP, how do
> >>         those bytes
> >>         > translate to strings? I'd assume it'd just be byte[] -> UTF8
> >>         conversion,
> >>         > only that this may result in illegal UTF8 chars when keys
> >>         are just random
> >>         > bytes, so I guess not... Perhaps md5 hashing? But then if
> >>         using an OPP and
> >>         > keys are actual strings, I want to have the same 0.6.0
> >>         functionality in
> >>         > place, meaning tokens are strings like the keys. I actually
> >>         tested this
> >>         > scenario and it looks working, so it seems like the String
> >>         keys are
> >>         > translated to UTF8, but what happens when they are invalid
> >>         UTF8?
> >>         > Another question is what's the story with RP in 0.7.0?
> >>         Should range query
> >>         > even be supported with tokens? If so, then are the tokens
> >>         expected to be
> >>         > string of integers? (e.g. "1234567890")
> >>         > Thanks.
> >>
> >>
> >>
> >>
> >>         --
> >>         Jonathan Ellis
> >>         Project Chair, Apache Cassandra
> >>         co-founder of Riptano, the source for professional Cassandra
> >>         support
> >>         http://riptano.com
> >>
> >>
> >
> >
> >
> 
> 
> 
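
For readers following the thread, here is a rough sketch of the approach Todd describes: dividing the token space into contiguous ranges so that separate threads can each scan one range. It is an illustration only, not code from the thread or from Cassandra. It assumes RandomPartitioner, that its tokens are integers between 0 and 2^127, and that their string form is the plain decimal number. The class and method names (TokenRangeSplitter, split) are hypothetical. Note Jonathan's point (b) above: tokens are expected to be in the form produced by TokenFactory.toString, so generating them by hand like this is exactly what he advises against.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra or hector.
class TokenRangeSplitter {
    // Assumption: RandomPartitioner tokens lie in [0, 2^127].
    static final BigInteger MAX_TOKEN = BigInteger.ONE.shiftLeft(127);

    // Returns `parts` contiguous {start, end} pairs as decimal strings.
    // Each range's start equals the previous range's end, which suits
    // token ranges that are start-exclusive and end-inclusive (assumed).
    static List<String[]> split(int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<String[]> ranges = new ArrayList<>();
        BigInteger n = BigInteger.valueOf(parts);
        BigInteger start = BigInteger.ZERO;
        for (int i = 1; i <= parts; i++) {
            // The last range ends exactly at MAX_TOKEN so nothing is lost
            // to integer division.
            BigInteger end = (i == parts)
                    ? MAX_TOKEN
                    : MAX_TOKEN.multiply(BigInteger.valueOf(i)).divide(n);
            ranges.add(new String[] { start.toString(), end.toString() });
            start = end;
        }
        return ranges;
    }
}
```

Each pair would then be handed to one worker as the start and end token of a KeyRange. With an order-preserving partitioner the token space is not numeric, so this sketch does not apply there.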


