hadoop-common-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject _temporary directory getting deleted mid-job?
Date Fri, 23 Jan 2009 09:29:27 GMT
I saw some puzzling behavior tonight when running a MapReduce program.

The map phase ran just fine, and the shuffle began. The reduce got to 33%
complete (end of shuffle) and then the task failed, claiming that
<output_dir>/_temporary had been deleted.

I didn't touch HDFS while this was going on.

I ran the job several more times, and the failure repeated twice. During
those runs I was doing bin/hadoop fs -ls <output_dir> periodically in
another window. The _temporary directory got created just fine, but at some
point after shuffling began, it was removed.
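
For anyone who wants to pin down when the directory disappears, the manual
fs -ls polling described above could be automated. The following is only a
minimal sketch, not something that was run against this job. The function
names, the "output_dir/_temporary" placeholder path, and the one-second
interval are all hypothetical, and it assumes that bin/hadoop fs -test -e
exits with status 0 when the path exists.

```python
import subprocess
import time
from datetime import datetime


def hdfs_path_exists(path, hadoop_cmd="bin/hadoop"):
    """Return True if the path exists in HDFS.

    Assumption: `hadoop fs -test -e PATH` exits with status 0 when the
    path exists and non-zero otherwise.
    """
    result = subprocess.run([hadoop_cmd, "fs", "-test", "-e", path])
    return result.returncode == 0


def watch_for_removal(path, exists=hdfs_path_exists, interval=1.0,
                      max_polls=None, sleep=time.sleep):
    """Poll until `path` has been seen at least once and then goes missing.

    Returns the local timestamp of the first poll that found the path
    missing after it had been seen, or None if max_polls polls complete
    without observing a removal.
    """
    seen = False
    polls = 0
    while max_polls is None or polls < max_polls:
        present = exists(path)
        polls += 1
        if present:
            seen = True          # directory has been created
        elif seen:
            return datetime.now()  # it existed earlier and is gone now
        sleep(interval)
    return None


# Example use (hypothetical placeholder path; substitute the real output dir):
# removed_at = watch_for_removal("output_dir/_temporary")
# print("_temporary disappeared at", removed_at)
```

The timestamp it returns could then be compared against the JobTracker and
TaskTracker logs to see what was running at that moment.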

I tried to see if I could manually race this, so I did a mkdir _temporary,
and the job proceeded just fine. Even more bizarre, the removal of the
_temporary directory did not occur on any subsequent MR jobs (executions of
the same, unmodified program). So I can't reproduce the bug.

This is on 0.18.2.

It went away, so I'm not *too* concerned, but I'd rather not deal with
heisenbugs if at all possible.

So: has anyone seen this behavior? Have you figured out how to reproduce it,
or even better, prevent it?

- Aaron
