hadoop-common-user mailing list archives

From Erik Forsberg <forsb...@opera.com>
Subject Debugging Partitioner problems
Date Wed, 20 Jan 2010 12:03:28 GMT

I have a problem with one of my reducers getting 3 times as much
data as each of the other 15 reducers, which lengthens the total
runtime of every job.

What would be the best way to debug this? I'm guessing I'm outputting
keys that somehow fool the partitioner. Can I tell Hadoop to save the
map outputs per reducer so that I can inspect what's in them?
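
One way to test the partitioner guess offline is to replay a sample of
map output keys through the same arithmetic as Hadoop's default
HashPartitioner, which is (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
The following is only a minimal sketch under stated assumptions: the class
name PartitionSkew and its methods are hypothetical, the keys are assumed to
be available one per line as text, and String.hashCode() is used as a
stand-in. A Text key hashes its bytes instead, so it can land in a different
partition than the one shown here. The sketch also counts records, not bytes,
so a few very large values would not show up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: replays keys through the
// same formula as the default HashPartitioner.
public class PartitionSkew {

    // Same arithmetic as HashPartitioner.getPartition, but using
    // String.hashCode() (an assumption; Text hashes differently).
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Counts how many of the given keys fall into each partition.
    static int[] countPerPartition(List<String> keys, int numReducers) {
        int[] counts = new int[numReducers];
        for (String key : keys) {
            counts[partitionFor(key, numReducers)]++;
        }
        return counts;
    }

    // Reads one key per line from stdin; the optional first argument is
    // the number of reducers (defaults to 16, as in the job described).
    public static void main(String[] args) throws IOException {
        int numReducers = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        List<String> keys = new ArrayList<>();
        try (BufferedReader in =
                 new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                keys.add(line);
            }
        }
        int[] counts = countPerPartition(keys, numReducers);
        for (int p = 0; p < counts.length; p++) {
            System.out.println(p + "\t" + counts[p]);
        }
    }
}
```

If one partition stands out in the printed counts, the skew comes from the
keys themselves (for example one very frequent key, or hash codes that
cluster modulo the reducer count) and not from the framework.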

Erik Forsberg <forsberg@opera.com>
Developer, Opera Software - http://www.opera.com/
