hadoop-hdfs-user mailing list archives

From Marcin Mejran <marcin.mej...@hooklogic.com>
Subject Jobtracker memory issues due to FileSystem$Cache
Date Tue, 16 Apr 2013 17:47:10 GMT
We've recently run into jobtracker memory issues on our new Hadoop cluster. A heap dump shows
thousands of DistributedFileSystem instances held in FileSystem$Cache, a bit over one for each
job run on the cluster, and the JobConf objects they reference support this view. I believe
these are created when the .staging directories get cleaned up, but I may be wrong on that.

From what I can tell in the dump, the username (probably not the ugi, hard to tell), scheme
and authority parts of the Cache$Key are the same across multiple objects in FileSystem$Cache.
I can only assume that the UserGroupInformation piece differs somehow every time it's created.
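
To make that assumption concrete, here is a minimal, self-contained sketch of the suspected
behavior. It is a simplified model, not Hadoop's actual classes: User, Key, countEntries, the
"mapred" username and the "namenode:8020" authority are all hypothetical stand-ins. It shows
that if the user part of a cache key is compared by object identity, a fresh user object per
job produces a fresh cache entry per job, even when name, scheme and authority all match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CacheKeyDemo {
    // Hypothetical stand-in for a user identity object. It does not override
    // equals/hashCode, so two instances with the same name are never equal.
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // Hypothetical stand-in for a cache key: scheme + authority + user.
    static final class Key {
        final String scheme;
        final String authority;
        final User user;

        Key(String scheme, String authority, User user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // Assumption being modeled: the user is compared by identity, not by name.
            return scheme.equals(k.scheme)
                    && authority.equals(k.authority)
                    && user == k.user;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, System.identityHashCode(user));
        }
    }

    // Simulates `jobs` job runs and returns how many cache entries exist afterwards.
    static int countEntries(int jobs, boolean reuseUser) {
        Map<Key, Object> cache = new HashMap<>();
        User shared = new User("mapred");
        for (int i = 0; i < jobs; i++) {
            // Either reuse one user object, or create a new one per job.
            User u = reuseUser ? shared : new User("mapred");
            cache.putIfAbsent(new Key("hdfs", "namenode:8020", u), new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(countEntries(1000, false)); // prints 1000
        System.out.println(countEntries(1000, true));  // prints 1
    }
}
```

In this model the cache only stays bounded when the same user object is reused, or when the
entries created for a per-job user object are explicitly removed once the job finishes.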

We're using CDH4.2, MR1, CentOS 6.3 and Java 1.6_31. Kerberos, LDAP and so on are not enabled.

Is there any known reason for this type of behavior?

