incubator-cassandra-user mailing list archives

From Takenori Sato <ts...@cloudian.com>
Subject Re: Random Distribution, yet Order Preserving Partitioner
Date Thu, 22 Aug 2013 23:47:47 GMT
Hi Nick,

> token and key are not the same. it was like that a long time ago (single
MD5 assumed single key)

True. That reminds me to run a test against the latest 1.2 instead of our
current 1.0!
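
To make the token/key distinction concrete, here is a minimal sketch. It is not Cassandra code: md5_token is a hypothetical stand-in for a RandomPartitioner-style token, and the key names are made up. It only shows that sorting by hashed token gives a different order than sorting by key, which is why a range scan that walks the ring in token order does not come back in key order.

```python
import hashlib

def md5_token(row_key):
    """Hypothetical stand-in for a RandomPartitioner-style token."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

# Made-up row keys that have an obvious natural order.
keys = ["user:%04d" % i for i in range(16)]

key_order = sorted(keys)                    # what an order-preserving scan returns
token_order = sorted(keys, key=md5_token)   # what a walk around the ring returns

for k in token_order[:5]:
    print(k, md5_token(k))
```

The same keys come back either way, but only key_order is usable for an ordered result.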

> if you want ordering, you can probably arrange your data so that you
can read it back in order.

Yeah, we have done that for a long time. That's called a wide row, right?
Or a compound primary key.

It can handle a few million columns, but not something like 10M. I mean,
requests for such a row concentrate on a particular node, so performance
degrades.
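
A rough sketch of that concentration, under simplifying assumptions: a hypothetical 3-node ring with evenly spaced tokens, an MD5-based token in the spirit of RandomPartitioner, and no replication. The names (md5_token, owner, reads_per_node) and the "events" keys are illustrative only.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

RING_SIZE = 2 ** 127  # assumed token space, in the spirit of RandomPartitioner

# Hypothetical 3-node ring with evenly spaced tokens.
NODE_TOKENS = [i * RING_SIZE // 3 for i in range(3)]

def md5_token(row_key):
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def owner(row_key):
    """Index of the node whose range (previous token, own token] holds the key."""
    return bisect_left(NODE_TOKENS, md5_token(row_key)) % len(NODE_TOKENS)

def reads_per_node(row_keys):
    return Counter(owner(k) for k in row_keys)

# 10,000 column reads against one wide row: the row key alone decides the
# node, so every read lands in the same place whichever column is requested.
wide_row_load = reads_per_node(["events"] * 10_000)

# The same number of reads against 10,000 distinct row keys spreads out.
many_rows_load = reads_per_node("events:%d" % i for i in range(10_000))

print("one wide row :", dict(wide_row_load))
print("many rows    :", dict(many_rows_load))
```

With replication the single row would be served by a few replicas rather than one node, but the load still does not spread across the whole cluster.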

> I also had an idea for a semi-ordered partitioner - instead of a single
MD5, to have two MD5s.

Sounds interesting. But we need a fully ordered result.

Anyway, I will try with the latest version.

Thanks,
Takenori


On Thu, Aug 22, 2013 at 6:12 PM, Nikolay Mihaylov <nmmm@nmmm.nu> wrote:

> my five cents -
> token and key are not the same. it was like that a long time ago (single
> MD5 assumed single key)
>
> if you want ordering, you can probably arrange your data so that you
> can read it back in order.
> for example, long ago I had a single column family with a single key and
> about 2-3 M columns - I do not suggest doing it this way, because it is
> the wrong way, but it makes the idea easy to understand.
>
> I also had an idea for a semi-ordered partitioner - instead of a single
> MD5, to have two MD5s.
> then you can get semi-ordered ranges, e.g. you get all cities in Canada
> in order, all cities in the US in order, and so on.
> however, this way things may get pretty unbalanced
>
> Nick
>
> On Thu, Aug 22, 2013 at 11:19 AM, Takenori Sato <tsato@cloudian.com> wrote:
>
>> Hi,
>>
>> I am trying to implement a custom partitioner that distributes keys
>> evenly, yet preserves order.
>>
>> The partitioner returns a BigInteger token, as RandomPartitioner does,
>> while it returns a string decorated key, as OrderPreservingPartitioner does.
>> * for now, since IPartitioner<T> does not support different types for
>> token and key, the BigInteger is simply converted to a string
>>
>> Then, I played around with cassandra-cli. As expected, in my 3-node test
>> cluster, get/set worked, but list (get_range_slices) didn't.
>>
>> This came out of an effort to overcome the scalability limits of wide
>> rows. So, I want to make it work!
>>
>> I am aware that some effort is required to make get_range_slices work.
>> But are there any other critical problems? For example, there seems to be
>> an assumption that token and key are the same. If that assumption runs
>> through the whole C* code base, this partitioner is not practical.
>>
>> Or have you tried something similar?
>>
>> I would appreciate your feedback!
>>
>> Thanks,
>> Takenori
>>
>
>
