hadoop-common-user mailing list archives

From Tiago Macambira <macamb...@gmail.com>
Subject Large number of map output keys and performance issues.
Date Wed, 06 May 2009 19:23:46 GMT
I am developing a MapReduce application with Hadoop that generates a very
large number of output keys during its map phase, and its performance is
abysmal.

Just reading the input data takes about 20 minutes, and processing it
without emitting anything from the map takes around 30 minutes, but running
the full application takes around 4 hours. Is this a known or expected issue?

Cheers.
Tiago Alves Macambira
--
"I may be drunk, but in the morning I will be sober, while you will
still be stupid and ugly." -Winston Churchill
