hadoop-common-user mailing list archives

From Frank Singleton <b17fly...@gmail.com>
Subject Re: Task process exit with nonzero status of 1
Date Fri, 09 Oct 2009 18:27:06 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Feng, Ao wrote:
> I probably know what the problem is, as we are encountering the same issue on our prod
> cluster. Every once in a while jobs start failing on the same task trackers, and the only
> error message is this exit status 1.
> 
> Go to the userlogs directory on the host where your tasks fail, and check whether there are
> 31,999 directories, all named like attempt_... Once you get to that point, the JVM would run
> out of file descriptors as it tries to create the 32,000th one. I confirmed that cleaning up
> the userlogs directory solves the problem... temporarily.
> 
> So my questions are:
> 
> 1. Where is the 32,000 limit imposed, and how do we change it?

As far as ext3 file system capabilities are concerned,

http://en.wikipedia.org/wiki/Ext3

Specifically

<quote>
"There is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000
links per inode"
</quote>
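
The arithmetic behind that number: on ext3 a directory's link count is 2 (its entry in the parent plus its own ".") plus one for the ".." of each sub-directory, so 32000 links leaves room for 31998 sub-directories. A minimal Python sketch for checking how close a directory is to that ceiling follows. The function names and the path in the usage note are illustrative assumptions, not part of Hadoop, and the limit constants are simply the figures quoted above.

```python
import os

# Figures quoted from the ext3 article above; other file systems differ.
EXT3_LINK_MAX = 32000
# Two links are always taken: the directory's entry in its parent and its own ".".
EXT3_MAX_SUBDIRS = EXT3_LINK_MAX - 2


def count_subdirs(path):
    """Count the immediate sub-directories of path (symlinks are not followed)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def subdir_headroom(path, limit=EXT3_MAX_SUBDIRS):
    """Return how many more sub-directories fit in path before hitting limit."""
    return limit - count_subdirs(path)
```

For example, `subdir_headroom("/path/to/userlogs")` (a placeholder path) returning 0 would mean the next attempt_... directory cannot be created on ext3.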


There is actually a funny story behind my personal experience with this (which I shall shorten
for brevity).

One day, after I typed "ls <tab>" in a directory (to get the list of files/directories via
bash completion), the system came back (after a while) and said (from memory):

Display all 31998 possibilities? (y or n)

Hmm, where have I seen a number like (or close to) that before?

Cheers / Frank

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iEYEARECAAYFAkrPgGMACgkQpZzN+MMic6cv2ACfQ7xTuIvXnx1VkmhNwJwW7Xlc
lugAn38nuAOKcDUFx/BokcuPcHBEbmIH
=O0fm
-----END PGP SIGNATURE-----
