incubator-cassandra-user mailing list archives

From Clement Honore <honor...@gmail.com>
Subject Help for creating a custom partitioner
Date Fri, 28 Sep 2012 16:20:17 GMT
Hi,

I have hierarchical data.

I'm storing it in a CF with a rowkey somewhat like (category, doc id), and
plenty of columns for each doc definition.

My data traversal is hierarchical too.

The user just chooses one category and then interacts only with docs
belonging to this category.

1) If I use RandomPartitioner, the docs of one category could be spread across
all nodes in the cluster => bad performance.

2) Still using RandomPartitioner, an alternative design could be rowkey=category
and column name=(doc id, prop name).

I don't want that, because I need fixed column names for indexing purposes,
and the "category" is quite a long string.

3) So I want to define a new partitioner for my rowkey (category, doc id)
that applies MD5 to the "category" part only.

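To make option 3 concrete, here is a minimal Python sketch (not Cassandra code) comparing a token computed from the whole rowkey with one computed from the "category" part only. It assumes a token is the MD5 digest read as a non-negative integer, which is roughly what RandomPartitioner does; the separator, the key encoding, and all function names are hypothetical.

```python
import hashlib

# Hypothetical delimiter between category and doc id inside the rowkey.
SEPARATOR = b"\x00"


def make_rowkey(category: str, doc_id: str) -> bytes:
    """Build a (category, doc id) rowkey using the assumed encoding."""
    return category.encode("utf-8") + SEPARATOR + doc_id.encode("utf-8")


def md5_token(data: bytes) -> int:
    """MD5 digest read as a signed big integer, made non-negative."""
    digest = hashlib.md5(data).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def full_key_token(rowkey: bytes) -> int:
    """Token from the whole rowkey (options 1 and 2)."""
    return md5_token(rowkey)


def category_token(rowkey: bytes) -> int:
    """Token from the category prefix only (option 3)."""
    category = rowkey.split(SEPARATOR, 1)[0]
    return md5_token(category)


keys = [make_rowkey("books/scifi", d) for d in ("doc1", "doc2", "doc3")]
full_tokens = {full_key_token(k) for k in keys}
cat_tokens = {category_token(k) for k in keys}
```

With whole-key hashing the three docs get unrelated tokens; with category-only hashing they all share one token.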

The question is: with such a partitioner, many rows on *one* node are going
to have the same MD5 value.

Is that going to hurt Cassandra's behavior or its performance?

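To illustrate what I mean by many rows sharing one MD5 value, here is a toy Python model. The 4-node ring, the node names, and the sample categories are all made up, and the sketch says nothing about how Cassandra itself treats equal tokens; it only counts rows per token and per node when the token comes from the category alone.

```python
import bisect
import hashlib
from collections import Counter


def md5_token(data: bytes) -> int:
    """MD5 digest read as a signed big integer, made non-negative."""
    digest = hashlib.md5(data).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


# Hypothetical ring: 4 nodes with evenly spaced tokens over 0..2**127.
RING = [(i * 2**127 // 4, f"node{i}") for i in range(4)]
RING_TOKENS = [token for token, _ in RING]


def owner(token: int) -> str:
    """First node whose token is >= the row token, wrapping around the ring."""
    idx = bisect.bisect_left(RING_TOKENS, token) % len(RING)
    return RING[idx][1]


# Made-up data set: one large category and one small one.
rows = [("books/scifi", f"doc{i}") for i in range(1000)]
rows += [("music/jazz", f"doc{i}") for i in range(10)]

# Category-only tokens: every row of a category gets the same token.
rows_per_token = Counter(md5_token(cat.encode("utf-8")) for cat, _ in rows)
rows_per_node = Counter(owner(md5_token(cat.encode("utf-8"))) for cat, _ in rows)
```

Here `rows_per_token` has only two entries for 1010 rows, and each category sits entirely on a single node.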

Thanks.

Regards,

Clément.
