incubator-cassandra-user mailing list archives

From Takenori Sato <ts...@cloudian.com>
Subject Re: Random Distribution, yet Order Preserving Partitioner
Date Tue, 27 Aug 2013 23:58:46 GMT
Hi Manoj,

Thanks for your advice.

We do basically the same thing. As you pointed out, we now face many
cases that cannot be solved by data modeling, where rows are reaching
100 million columns.

We could split them into multiple metadata rows, but that would add
complexity and make things more error prone. If possible, we want to avoid that.
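
To illustrate the extra bookkeeping such a split involves, here is a minimal in-memory sketch in Python. It is not taken from this thread's actual system: the names `BucketedIndex`, `bucket_of` and `PREFIX_LEN`, and the choice of bucketing by a fixed-length key prefix, are all assumptions made for illustration. Each bucket stands in for one metadata row, and a range query has to visit every bucket that can overlap the requested range.

```python
import bisect
from collections import defaultdict

PREFIX_LEN = 2  # hypothetical bucket width: the first 2 characters of the key


def bucket_of(key):
    """Order-preserving bucket id: if k1 <= k2 then bucket_of(k1) <= bucket_of(k2)."""
    return key[:PREFIX_LEN]


class BucketedIndex:
    """Stand-in for several metadata rows, one sorted key list per bucket."""

    def __init__(self):
        self.buckets = defaultdict(list)  # bucket id -> sorted list of row keys

    def add(self, key):
        keys = self.buckets[bucket_of(key)]
        i = bisect.bisect_left(keys, key)
        if i == len(keys) or keys[i] != key:  # skip duplicates
            keys.insert(i, key)

    def range_keys(self, start, end):
        """Return all keys in [start, end], in order, across buckets."""
        first, last = bucket_of(start), bucket_of(end)
        result = []
        for b in sorted(self.buckets):  # visit buckets in key order
            if first <= b <= last:
                keys = self.buckets[b]
                lo = bisect.bisect_left(keys, start)
                hi = bisect.bisect_right(keys, end)
                result.extend(keys[lo:hi])
        return result


index = BucketedIndex()
for k in ["ca-toronto", "ca-ottawa", "us-boston", "us-austin", "jp-tokyo"]:
    index.add(k)
in_range = index.range_keys("ca-p", "us-b")
```

Even in this toy form, the reader has to know the bucketing rule, pick the right buckets, and stitch the partial results together, which is the added complexity mentioned above. A skewed key distribution would also leave some buckets much wider than others.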

- Takenori

On 2013/08/27 at 21:37, Manoj Mainali <mainalimanoj@gmail.com> wrote:

Hi Takenori,

I can't tell for sure without knowing what kind of data you have and how
much of it. You can use the random partitioner together with a metadata
row that stores the row keys, for example:

{metadata_row}: key1 | key2 | key3
key1:column1 | column2

When you read, you can always query directly by key if you already know
it. For range queries, first query the metadata_row to get the keys you
want in order. Then you can do a multi_get to fetch your actual data.

The downside is you have to do two read queries, and depending on how much
data you have you will end up with a wide metadata row.
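
The two-step read described above can be sketched as follows. This is a minimal in-memory model in Python, not Cassandra client code: the class `MetadataRowIndex` and its methods are hypothetical names, a plain dict stands in for rows placed by the random partitioner, and a sorted list stands in for the ordered columns of the metadata row.

```python
import bisect


class MetadataRowIndex:
    """In-memory stand-in for data rows plus one ordered metadata row."""

    def __init__(self):
        self.rows = {}          # row key -> {column name: value}, no key ordering
        self.metadata_row = []  # sorted row keys, like ordered column names

    def put(self, key, columns):
        if key not in self.rows:
            bisect.insort(self.metadata_row, key)  # keep the index sorted
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        """Direct read when the key is already known (one query)."""
        return self.rows[key]

    def range_query(self, start, end):
        """Ordered range read over [start, end] (two queries)."""
        # Query 1: slice the metadata row to get the keys in order.
        lo = bisect.bisect_left(self.metadata_row, start)
        hi = bisect.bisect_right(self.metadata_row, end)
        keys = self.metadata_row[lo:hi]
        # Query 2: fetch the rows for those keys (the multi_get step).
        return [(k, self.rows[k]) for k in keys]


store = MetadataRowIndex()
store.put("key3", {"column1": "c"})
store.put("key1", {"column1": "a", "column2": "b"})
store.put("key2", {"column1": "x"})
ordered = store.range_query("key1", "key2")
```

In this model the metadata row grows by one entry per data row, which is where the wide-row concern comes from.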

Manoj


On Fri, Aug 23, 2013 at 8:47 AM, Takenori Sato <tsato@cloudian.com> wrote:

> Hi Nick,
>
> > token and key are not same. it was like this long time ago (single MD5
> assumed single key)
>
> True. That reminds me of making a test with the latest 1.2 instead of our
> current 1.0!
>
> > if you want ordered, you probably can arrange your data in a way so you
> can get it in ordered fashion.
>
> Yeah, we have done that for a long time. That's called a wide row, right?
> Or a compound primary key.
>
> It can handle some millions of columns, but not more like 10M. I mean, a
> request for such a row concentrates on a particular node, so the
> performance degrades.
>
> > I also had idea for semi-ordered partitioner - instead of single MD5,
> to have two MD5's.
>
> Sounds interesting. But, we need a fully ordered result.
>
> Anyway, I will try with the latest version.
>
> Thanks,
> Takenori
>
>
> On Thu, Aug 22, 2013 at 6:12 PM, Nikolay Mihaylov <nmmm@nmmm.nu> wrote:
>
>> my five cents -
>> token and key are not same. it was like this long time ago (single MD5
>> assumed single key)
>>
>> if you want ordered, you probably can arrange your data in a way so you
>> can get it in ordered fashion.
>> for example long ago, i had single column family with single key and
>> about 2-3 M columns - I do not suggest you to do it this way, because is
>> wrong way, but it is easy to understand the idea.
>>
>> I also had idea for semi-ordered partitioner - instead of single MD5, to
>> have two MD5's.
>> then you can get semi-ordered ranges, e.g. you get ordered all cities in
>> Canada, all cities in US and so on.
>> however in this way things may get pretty unbalanced
>>
>> Nick
>>
>>
>>
>>
>>
>> On Thu, Aug 22, 2013 at 11:19 AM, Takenori Sato <tsato@cloudian.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to implement a custom partitioner that evenly distributes,
>>> yet preserves order.
>>>
>>> The partitioner returns a token as a BigInteger, as RandomPartitioner
>>> does, while the decorated key is a string, as in OrderPreservingPartitioner.
>>> * for now, since IPartitioner<T> does not support different types for
>>> token and key, the BigInteger is simply converted to a string
>>>
>>> Then, I played around with cassandra-cli. As expected, in my 3 nodes
>>> test cluster, get/set worked, but list(get_range_slices) didn't.
>>>
>>> This came from a challenge to overcome a wide row scalability. So, I
>>> want to make it work!
>>>
>>> I am aware that some efforts are required to make get_range_slices work.
>>> But are there any other critical problems? For example, it seems there is
>>> an assumption that token and key are the same. If this is throughout the
>>> whole C* code, this partitioner is not practical.
>>>
>>> Or have you tried something similar?
>>>
>>> I would appreciate your feedback!
>>>
>>> Thanks,
>>> Takenori
>>>
>>
>>
>
