hadoop-hdfs-user mailing list archives

From Jay Vyas <jayunit...@gmail.com>
Subject partition as block?
Date Tue, 30 Apr 2013 18:46:04 GMT
Hi guys:

I'm wondering: if I'm running MapReduce jobs on a cluster with large block
sizes, can I increase performance with any of the following:

1) A custom FileInputFormat

2) A custom partitioner

3) -DnumReducers

Clearly, (3) will be an issue because it might overload tasks and increase
network traffic, but maybe (1) or (2) would be a precise way to "use"
partitions as a "poor man's" block.

Just a thought - not sure if anyone has tried (1) or (2) before in order to
simulate blocks and increase locality by utilizing the partition API.
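
To make options (1) and (2) concrete, here is a minimal sketch of the
arithmetic involved, written as plain Java with no Hadoop dependency. It
assumes the split-size formula used by FileInputFormat in the newer
org.apache.hadoop.mapreduce API, max(minSize, min(maxSize, blockSize)), and
the hash-modulo rule used by the default HashPartitioner. The class name, the
method names, and the 512 MB / 2 GB figures are hypothetical and chosen only
for illustration. The split count ignores the small slop Hadoop allows on
the last split.

```java
public class SplitSizeSketch {

    // Assumed to mirror FileInputFormat's split sizing:
    // the block size, clamped between a minimum and a maximum split size.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Simplified split count: ceiling division of file length by split size.
    // Real Hadoop tolerates a slightly oversized final split; that is ignored here.
    public static long countSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    // Same rule as the default hash partitioner: mask off the sign bit,
    // then take the remainder by the number of reduce tasks.
    public static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 512 * mb;   // hypothetical "large" block size
        long fileLength = 2048 * mb; // hypothetical 2 GB input file

        // With no maximum configured, one split per block.
        long defaultSplit = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        // Capping the maximum split size at 128 MB yields smaller splits
        // without changing the block size on disk.
        long cappedSplit = computeSplitSize(blockSize, 1L, 128 * mb);

        System.out.println("default splits: " + countSplits(fileLength, defaultSplit));
        System.out.println("capped splits:  " + countSplits(fileLength, cappedSplit));
        System.out.println("reducer for key \"abc\" of 4: " + hashPartition("abc", 4));
    }
}
```

Under these assumptions, capping the maximum split size (for example through
FileInputFormat.setMaxInputSplitSize in the newer API) gives more map tasks
per block without a custom FileInputFormat. The partitioner, by contrast,
only decides which reducer receives each map output key, so it acts after
the input splits have already been read.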

Jay Vyas
