hadoop-common-user mailing list archives

From yazgoo <yaz...@gmail.com>
Subject Re: [HDP][YARN] app timeline server stops with "too many open files" error.
Date Fri, 03 Jun 2016 07:53:47 GMT
Thanks Gagan,

Turns out that in Ambari, yarn_user_nofile_limit is already set to 98304.

On the machine in question, I have:

$ grep nofile /etc/security/limits.d/yarn.conf
yarn   - nofile 98304
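As a sanity check (my own addition, not from any HDP doc): a login shell's effective soft limit can be read in two places, and they should agree:

```shell
# Soft nofile limit as the shell builtin reports it...
ulimit -Sn
# ...and as the kernel records it for this very shell; the first numeric
# column is the soft limit, the second the hard limit.
grep "Max open files" /proc/self/limits
```

If a fresh login shell already shows 98304 here, the limits.d entry itself is fine and the problem is in how the daemon is launched.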

But still, the timeline server .out log shows:

$ grep open /var/log/hadoop-yarn/yarn/yarn-yarn-timelineserver-my-machine.out
open files                      (-n) 1024

And looking at the actual limit of the timelineserver when it starts up:

ubuntu@ip-10-0-146-185:~$ sudo cat /proc/$(pgrep -f timelineserver-config)/limits | grep "open files"
Max open files            4096                 4096                 files
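One thing I'd try (a sketch, under the assumption that the daemon inherits its limits from the shell that launches it; the exact startup script varies by HDP version): raise the soft limit to the hard limit right before starting the daemon, since child processes inherit it:

```shell
# Raise this shell's soft nofile limit as far as the hard limit allows;
# any process started afterwards inherits the raised limit.
hard=$(ulimit -Hn)
ulimit -Sn "$hard"
ulimit -Sn   # confirm the new soft limit before launching the daemon
```

If even the hard limit printed here is only 4096, then limits.conf was never applied to this environment at all, and the limit has to be raised further up (init script or the Ambari agent).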

Le 02/06/2016 à 19:43, Gagan Brahmi a écrit :
> If the cluster is managed via Ambari, it is best to set the limit
> on the number of open files for ATS from Ambari. In Ambari, under YARN ->
> Configs, search for "yarn_user_nofile_limit" and set the value you
> prefer for the number of open files.
>
> In case you are managing the cluster manually, you will have to verify
> the security limits configured for the user running ATS (normally the
> user is yarn). Check whether any user-specific file created under the
> /etc/security/limits.d/ directory overrides your settings. Once you set
> the values, log out and log back in as the user running ATS.
>
> When you log in, run 'ulimit -n' to check whether the new value
> has taken effect. If the cluster is managed via Ambari, the changes
> won't take effect until you restart the service from Ambari.
>
> HTH.
>
>
> Regards,
> Gagan Brahmi
>
> On Thu, Jun 2, 2016 at 9:06 AM, Eric III <yazgoo@gmail.com> wrote:
>> Hi,
>>
>> I'm having the same issue.
>> I've tried setting 'ulimit -n' in yarn-daemon.sh, but it does not seem to work.
>> Jon, what do you mean when you say "check out the ulimit -n"?
>>
>> Thanks,
>>
>> Eric
>>
>>> Modifying /etc/security/limits.conf won't take immediate effect.
>>> Either restart the system or check out the ulimit -n command and try
>>> again.
>>> Thanks,
>>> Jon
>> On Tue, May 31, 2016 at 11:48 PM, Mungeol Heo <mungeol.heo@gmail.com> wrote:
>>> Hi,
>>>
>>> After starting the app timeline server,
>>> '/var/log/hadoop-yarn/yarn/yarn-yarn-timelineserver-hostname.net.log'
>>> logs many messages like the ones below.
>>>
>>> ...
>>> 2016-06-01 12:36:28,641 INFO  timeline.RollingLevelDB
>>> (RollingLevelDB.java:initRollingLevelDB(266)) - Added rolling leveldb
>>> instance 2016-05-23-08 to indexes-ldb
>>> 2016-06-01 12:36:28,641 INFO  timeline.RollingLevelDB
>>> (RollingLevelDB.java:initRollingLevelDB(258)) - Initializing rolling
>>> leveldb instance
>>> :file:/data/hadoop/yarn/timeline/leveldb-timeline-store/indexes-ldb.2016-05-09-14
>>> for start time: 1462802400000
>>> ...
>>>
>>> Then, it fails with a 'too many open files' error.
>>>
>>> ...
>>> 2016-06-01 12:36:28,715 INFO  service.AbstractService
>>> (AbstractService.java:noteFailure(272)) - Service
>>> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore
>>> failed in state INITED; cause:
>>> org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error:
>>> /data/hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/LOCK:
>>> Too many open files
>>> org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error:
>>> /data/hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/LOCK:
>>> Too many open files
>>>         at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
>>>         at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
>>>         at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
>>>         at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.serviceInit(RollingLevelDBTimelineStore.java:324)
>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>         at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:151)
>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>>>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
>>>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
>>> ...
>>>
>>> I've tried increasing nofile from 65000 to 650000 in
>>> '/etc/security/limits.conf',
>>> and also increasing the value of the YARN config
>>> 'yarn_user_nofile_limit' to 65536.
>>> However, nothing works.
>>>
>>> Any help would be greatly appreciated.
>>> Thank you.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: user-help@hadoop.apache.org
>>>



