cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: KeyRange.token in 0.7.0
Date Tue, 24 Aug 2010 18:53:52 GMT
in other words you're reinventing hadoop.  not really recommended, but
knock yourself out if that's what you want to do. :)

On Tue, Aug 24, 2010 at 1:28 PM, B. Todd Burruss <bburruss@real.com> wrote:
> i just came across this and i use tokens in range queries because it is
> an easy straightforward way to divide the keyspace and operate on it
> using multiple threads and throttle the processing.  maybe this is what
> hadoop does, i don't know much about hadoop.
>
> so i don't really agree that i'm doing it wrong.  why is this?
>
>
> On Wed, 2010-08-18 at 11:18 -0700, Ran Tavory wrote:
>>
>>
>> On Wed, Aug 18, 2010 at 4:30 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>         (a) if you're using token queries and you're not hadoop,
>>         you're doing it wrong
>> ah, didn't know that, so I guess I'll remove support for it from
>> hector...
>>
>>         (b) they are expected to be of the form generated by
>>         TokenFactory.toString and fromString. You should not be
>>         generating them yourself.
>>
>>
>>         On Wed, Aug 18, 2010 at 7:56 AM, Ran Tavory <rantav@gmail.com> wrote:
>>         > I'm a bit confused WRT KeyRange's tokens in 0.7.0
>>         > When making a range query you can either use KeyRange.key or
>>         > KeyRange.token.
>>         > In 0.7.0 key was typed as byte[]. tokens remain strings.
>>         > What does this string represent in case of a RP and in case of an
>>         > OPP? Did this change in 0.7.0?
>>         > AFAIK in 0.6.0 if the partitioner is OPP then the tokens are actual
>>         > strings and they might just be actual subset of the keys. When using
>>         > a RP tokens are BigIntegers (keys are still strings) and I'm not
>>         > actually sure if you're allowed to shoot a range query using
>>         > tokens...
>>         > In 0.7.0 since keys are now bytes, when using an OPP, how do those
>>         > bytes translate to strings? I'd assume it'd just be byte[] -> UTF8
>>         > conversion, only that this may result in illegal UTF8 chars when
>>         > keys are just random bytes, so I guess not... Perhaps md5 hashing?
>>         > But then if using an OPP and keys are actual strings, I want to have
>>         > the same 0.6.0 functionality in place, meaning tokens are strings
>>         > like the keys. I actually tested this scenario and it looks working,
>>         > so it seems like the String keys are translated to UTF8, but what
>>         > happens when they are invalid UTF8?
>>         > Another question is what's the story with RP in 0.7.0? Should range
>>         > query even be supported with tokens? If so, then are the tokens
>>         > expected to be string of integers? (e.g. "1234567890")
>>         > Thanks.
>>
>>
>>
>>
>>         --
>>         Jonathan Ellis
>>         Project Chair, Apache Cassandra
>>         co-founder of Riptano, the source for professional Cassandra
>>         support
>>         http://riptano.com
>>
>>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
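
For readers following the thread: the approach Todd describes (dividing the
keyspace by token and handing each slice to a worker thread) can be sketched
roughly as below. This is only an illustration, not anything taken from
Cassandra or Hector. It assumes the RandomPartitioner, assumes its tokens run
from 0 to 2**127, and assumes the string form is the plain decimal integer,
which may not match what TokenFactory.toString produces in every version. The
name split_token_ranges is hypothetical.

```python
# Sketch only: split the RandomPartitioner token space into contiguous slices.
# Assumption: tokens are integers in 0..2**127, written as decimal strings.
RP_MAX_TOKEN = 2 ** 127


def split_token_ranges(parts, lo=0, hi=RP_MAX_TOKEN):
    """Return `parts` (start_token, end_token) string pairs covering lo..hi."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    span = hi - lo
    # Integer arithmetic keeps the boundaries exact; floats would lose
    # precision at this magnitude.
    bounds = [lo + (span * i) // parts for i in range(parts + 1)]
    # Adjacent pairs share a boundary so the slices are contiguous.
    return [(str(bounds[i]), str(bounds[i + 1])) for i in range(parts)]


if __name__ == "__main__":
    for start, end in split_token_ranges(4):
        print(start, end)
```

Each pair would then serve as the token bounds of one range query, one pair
per worker. If token ranges are start-exclusive and end-inclusive, as I
understand the Thrift API to treat them, slices that share a boundary should
neither overlap nor leave gaps. Per point (b) above, generating token strings
by hand like this is discouraged, so treat the decimal format as an
assumption to verify against your partitioner.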
