hadoop-common-user mailing list archives

From Austin Chungath <austi...@gmail.com>
Subject spawn maps without any input data - hadoop streaming
Date Tue, 16 Jul 2013 09:09:53 GMT

I am trying to generate random data using Hadoop Streaming and Python. It is
a map-only job, and I need to run a specific number of maps. The maps take no
input, since each one only generates random data.
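
For context, here is a minimal sketch of the kind of mapper I have in mind. It is only an illustration: the function name generate_records, the NUM_RECORDS environment variable, and the tab-separated "index, random float" record layout are all hypothetical choices, not anything Hadoop Streaming requires.

```python
#!/usr/bin/env python
"""Sketch of a Hadoop Streaming mapper that emits random records."""
import os
import random
import sys


def generate_records(num_records, seed=None):
    """Yield num_records lines of the form '<index>\\t<random float>'."""
    rng = random.Random(seed)  # seed=None means a different stream per run
    for i in range(num_records):
        yield "%d\t%.6f" % (i, rng.random())


def main():
    # NUM_RECORDS is a hypothetical variable used only for this sketch.
    num_records = int(os.environ.get("NUM_RECORDS", "10"))
    for record in generate_records(num_records):
        sys.stdout.write(record + "\n")


if __name__ == "__main__":
    main()
```

Each map task would run this script once, so the total amount of data generated depends on how many maps are spawned.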

How do I specify the number of maps to run? (I am confused here because, if
I am not mistaken, the number of maps spawned depends on the size of the
input data.)
Any ideas as to how this can be done?
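
To make the source of my confusion concrete, this is roughly how I understand the relationship between input size and map count. It is a simplified mental model, not Hadoop's actual split logic: the function name estimate_map_tasks is hypothetical, the 64 MB split size is an assumed default, and details such as split slop and empty files are ignored.

```python
import math


def estimate_map_tasks(file_sizes_bytes, split_size_bytes=64 * 1024 * 1024):
    """Roughly estimate map tasks as one per split of each non-empty file.

    Simplified model: assumes splittable files and a fixed split size.
    """
    total = 0
    for size in file_sizes_bytes:
        if size > 0:
            total += int(math.ceil(size / float(split_size_bytes)))
    return total
```

Under this model, a job with no input files gets an estimate of zero maps, which is exactly the situation I am in.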

Warm regards,
