I started with the ordered partitioner because I was hoping to make use of the map-reduce functionality. However, my data most likely ended up lumped onto 2 key machines, with most of it on one (as seen in another thread; machine failures were also partly to blame for the uneven distribution). One solution I am trying is to load balance. Is there anything else I can try, for instance converting the partitioner to random on a live system?
I know this sounds like an odd request, but I am curious about my options. I did see a post mentioning that one can compute the md5 hash of each key, insert using that hash, and keep a mapping table from key to md5 hash. Unfortunately, the data is already loaded using an ordered partitioner, and I was wondering if there is a way to switch to random now.
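For reference, here is roughly how I understand the md5 approach from that post. This is only a minimal sketch: plain Python dicts stand in for the data table and the mapping table, and the names `hashed_key`, `insert`, `lookup`, `data_table`, and `key_to_hash` are made up for illustration and are not part of any Cassandra client API.

```python
import hashlib

# Hypothetical stand-ins for the two tables: in a real cluster these would
# be column families, not in-memory dicts.
data_table = {}    # md5 hash of key -> value (rows spread out by hash)
key_to_hash = {}   # original key -> md5 hash (the mapping table)


def hashed_key(key: str) -> str:
    """Return the hex MD5 digest of the original key."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def insert(key: str, value: str) -> None:
    """Store the value under the hashed key and record the mapping."""
    h = hashed_key(key)
    data_table[h] = value
    key_to_hash[key] = h


def lookup(key: str) -> str:
    """Resolve the original key through the mapping table, then fetch."""
    return data_table[key_to_hash[key]]


insert("user:1001", "alice")
insert("user:1002", "bob")
```

Keys that sort next to each other, like `user:1001` and `user:1002`, get unrelated hashes, so even an ordered partitioner would no longer place them side by side. The catch is that this only helps for newly written data; the rows already loaded would still need to be rewritten under the hashed keys.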