I'm having problems in my Cassandra/Hadoop (1.0.8 + cdh3u3) cluster related to how Cassandra splits the data to be processed by Hadoop.

I'm currently testing a MapReduce job, starting from a CF of roughly 1500 rows, with

cassandra.input.split.size 10
cassandra.range.batch.size 1
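For context, these are the standard Hadoop configuration properties read by Cassandra's ColumnFamilyInputFormat, and can be passed on the command line as generic options; a sketch of the invocation (the jar and class names are placeholders, and -D only takes effect if the job driver goes through ToolRunner/GenericOptionsParser):

```shell
# Hypothetical job jar/class; the -D properties are the ones quoted above.
hadoop jar my-job.jar com.example.MyCassandraJob \
  -Dcassandra.input.split.size=10 \
  -Dcassandra.range.batch.size=1
```

The same values can also be set programmatically in the job driver via org.apache.cassandra.hadoop.ConfigHelper before submitting the job.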

but what I consistently see is that, while most of the tasks have 1-20 rows assigned each, one of them is assigned 400+ rows, which gives me all sorts of problems in terms of timeouts and memory consumption (not to mention seeing the mapper progress bar going to 4000% and beyond).

Do you have any suggestions on how to solve or troubleshoot this issue?

Filippo Diotalevi