hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: "Too many open files" error after running a number of jobs
Date Tue, 17 Jul 2007 18:17:18 GMT

The stacktrace is on the client, not on the datanode. If it is on Linux, 
you can check /proc/<pid>/fd to see which fds are still open. Usually 1024 
should be plenty for the client (and even for a datanode).
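A quick way to follow that suggestion on Linux is to count the entries under /proc/<pid>/fd, where each entry is a symlink to the open file, socket, or pipe. A minimal sketch (using the current shell's own pid as a stand-in for the client JVM's pid):

```shell
# Inspect open file descriptors of a process on Linux via /proc.
# PID here is the current shell's pid; substitute the Hadoop client's pid.
PID=$$

# Count how many fds are currently open.
ls /proc/$PID/fd | wc -l

# Each fd is a symlink; listing them shows what each one points to.
ls -l /proc/$PID/fd | head
```

Comparing that count against the ulimit (1024 by default here) shows how close the process is to the limit, and the symlink targets reveal which files or sockets are being leaked.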

Andrzej Bialecki wrote:
> Shailendra Mudgal wrote:
>> Hi,
>> We have upgraded our code to nutch-0.9 with hadoop-0.12.2-core.jar. After
>> running say 50 nutch jobs (which include inject/generate/fetch/parse
>> etc.),
>> we start getting a "Too many open files" error on our cluster. We are using
>> Linux boxes with kernel 2.6.9, and the open-files limit is 1024 on these
>> machines, which is the default. I have read several mails from the nutch-user
>> and
>> hadoop-user mailing lists, and the only fix I found was to increase the
>> number
>> of open files using ulimit. Is there any other solution for this
>> problem at
>> the code level? BTW, the value for io.sort.factor is 8 in our hadoop-site.xml.
>> Does anybody have any idea in this regard? Any help will be appreciated.
> Apparently datanodes that perform intensive I/O need a higher 
> limit. Try increasing the open-files limit to 16k or so.
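For reference, raising the limit works roughly as follows; the "hadoop" user name below is an assumption, substitute whichever account runs the daemons:

```shell
# Show the current soft and hard limits on open files.
ulimit -Sn
ulimit -Hn

# Raise the soft limit up to the hard limit for this shell session
# (exceeding the hard limit requires root or a limits.conf change).
ulimit -Sn "$(ulimit -Hn)"

# To make a higher limit permanent for the user running the Hadoop
# daemons (assumed here to be "hadoop"), add to /etc/security/limits.conf:
#   hadoop  soft  nofile  16384
#   hadoop  hard  nofile  16384
```

Note that a ulimit change only affects processes started from the shell (or login session) where it was applied, so the datanodes must be restarted afterwards to pick it up.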
