hadoop-common-user mailing list archives

From Mike Kendall <mkend...@justin.tv>
Subject common reasons a map task would fail on a distributed cluster but not locally?
Date Sat, 14 Nov 2009 19:22:37 GMT
If I run my task locally as:

cat input | ./map.py | ./sum.py > output

it works just fine. However, running it on my cluster as:

hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-*-streaming.jar \
    -file map.py -mapper map.py -file cat.py -reducer cat.py \
    -input input -output

it fails. I'm really confused as to why this script fails when my other
scripts, written with the same methodology, work.

Is there a "common reasons map tasks fail" list somewhere? Any ideas?
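
One difference between the shell pipe and a streaming job is that Hadoop sorts the mapper output by key before the reducer sees it, and the pipe above has no sort step. A rough sketch of a closer local approximation follows. The function name simulate_streaming and the word_mapper / sum_reducer stand-ins are hypothetical and are not the actual map.py or sum.py; the sketch also assumes tab-separated key/value lines, which is the streaming default.

```python
import itertools


def simulate_streaming(lines, mapper, reducer):
    """Approximate a streaming job locally: map, sort by key, then reduce."""
    # Map phase: each input line may yield zero or more "key\tvalue" lines.
    mapped = [out for line in lines for out in mapper(line)]
    # Shuffle phase: the framework sorts mapper output by key before reducing.
    # A plain shell pipe without `sort` skips this step.
    mapped.sort(key=lambda kv: kv.split("\t", 1)[0])
    return list(reducer(mapped))


# Hypothetical stand-ins for map.py and sum.py (assumptions for illustration).
def word_mapper(line):
    for word in line.split():
        yield f"{word}\t1"


def sum_reducer(sorted_lines):
    # Relies on equal keys being adjacent, which only holds after sorting.
    keyed = (entry.split("\t", 1) for entry in sorted_lines)
    for key, group in itertools.groupby(keyed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


if __name__ == "__main__":
    for row in simulate_streaming(["b a", "a c a"], word_mapper, sum_reducer):
        print(row)
```

If a reducer gives different results with and without the sort step, the ordering assumption is a likely suspect. The sketch does not cover environment differences on the task nodes, such as a missing interpreter or a script that is not executable.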
