hadoop-hdfs-user mailing list archives

From Arni Sumarlidason <sumarlida...@gmail.com>
Subject processing data evenly
Date Wed, 02 Sep 2015 21:08:45 GMT
I'm having trouble getting my data reduced evenly across nodes.

-> map: read a single 200,000-line text file and output <0L,line> for each line
-> partition: a custom partitioner returns i++ % numPartitions (i is a static
member) in an attempt to distribute the lines across as many reducers as possible
-> reduce: only 13 or 18 of the 100 nodes end up busy.
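
For reference, the partitioning arithmetic described above can be sketched as a standalone Java program. This is only an illustration, not the actual job code: it uses no Hadoop classes, the names RoundRobinSketch, getPartition and distribute are hypothetical, and it assumes every record passes through one counter in a single JVM.

```java
// Standalone sketch of the round-robin partitioning idea (no Hadoop dependency).
// All names here are hypothetical and exist only for illustration.
class RoundRobinSketch {
    // Static counter, mirroring the "static member i" in the description.
    // Assumption: a static field is local to one JVM, so this models a single task.
    private static int counter = 0;

    // Returns the partition for the next record: i++ % numPartitions.
    static int getPartition(int numPartitions) {
        return counter++ % numPartitions;
    }

    // Sends `records` records through the partitioner and counts how many
    // land in each partition.
    static int[] distribute(int records, int numPartitions) {
        counter = 0; // reset so repeated calls give the same result
        int[] counts = new int[numPartitions];
        for (int r = 0; r < records; r++) {
            counts[getPartition(numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = distribute(200_000, 300);
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int c : counts) {
            min = Math.min(min, c);
            max = Math.max(max, c);
        }
        // 200,000 = 300 * 666 + 200, so 200 partitions get 667 lines and 100 get 666.
        System.out.println("min=" + min + " max=" + max); // prints "min=666 max=667"
    }
}
```

Under those assumptions the arithmetic itself is even, so the sketch only shows the distribution I expect, not what the cluster actually does.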

My hope is to have 300 containers across 100 nodes, each processing ~666 lines.
How can I achieve this?
