cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-1473) Implement a Cassandra aware Hadoop mapreduce.Partitioner
Date Thu, 11 Aug 2011 13:59:27 GMT


Jonathan Ellis commented on CASSANDRA-1473:

That is, I don't think we can do this for non-RP Cassandra partitioners.

I guess it's still worth doing since RP is the recommended partitioner.  Should we return 0, or
random(0..numPartitions), for BOP?

Also not sure how you go about registering your custom mr.Partitioner w/ Hadoop.
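
To make the question above concrete, here is a minimal sketch of how a token could be mapped to a reducer index for RandomPartitioner, with the "return 0" fallback for other partitioners. It is an illustration under stated assumptions, not the patch for this ticket: the class and method names are hypothetical, the token is assumed to be the MD5 of the key read as a non-negative integer in [0, 2^127], and the token space is split into equal-width ranges rather than following the actual ring ownership.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Cassandra or Hadoop.
final class TokenRangePartitionSketch {
    // Assumed upper bound of the RandomPartitioner token space (2^127).
    static final BigInteger MAX_TOKEN = BigInteger.ONE.shiftLeft(127);

    // Assumption: the token is the MD5 digest of the key, read as a signed
    // big-endian integer and made non-negative.
    static BigInteger md5Token(byte[] key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(md5.digest(key)).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Splits [0, 2^127] into numPartitions equal-width ranges, so that
    // neighbouring tokens land in the same partition.
    static int partitionForToken(BigInteger token, int numPartitions) {
        int p = token.multiply(BigInteger.valueOf(numPartitions))
                     .divide(MAX_TOKEN)
                     .intValue();
        // A token of exactly 2^127 would otherwise map to numPartitions.
        return Math.min(p, numPartitions - 1);
    }

    // The "return 0" option for partitioners with no usable numeric token;
    // random(0..numPartitions) would be the alternative.
    static int fallbackPartition() {
        return 0;
    }

    public static void main(String[] args) {
        byte[] key = "example-row-key".getBytes(StandardCharsets.UTF_8);
        BigInteger token = md5Token(key);
        System.out.println("token=" + token
                + " partition=" + partitionForToken(token, 4));
    }
}
```

On the registration question, Hadoop's Job.setPartitionerClass(...) is the standard hook for supplying a mapreduce.Partitioner subclass; a real implementation would extend that class and call something like partitionForToken from its getPartition(key, value, numPartitions) method.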

> Implement a Cassandra aware Hadoop mapreduce.Partitioner
> --------------------------------------------------------
>                 Key: CASSANDRA-1473
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Patricio Echague
>             Fix For: 1.0
> When using an IPartitioner that does not sort data in byte order (RandomPartitioner for
> example) with Cassandra's Hadoop integration, Hadoop is unaware of the output order of the
> data.
> We can make Hadoop aware of the proper order of the output data by implementing Hadoop's
> mapreduce.Partitioner interface: then Hadoop will handle sorting all of the data according
> to Cassandra's IPartitioner, and the writing clients will be able to connect to smaller numbers
> of Cassandra nodes.

This message is automatically generated by JIRA.