hadoop-common-user mailing list archives

From Gautam <gautamkows...@gmail.com>
Subject Applications bottlenecked in ACCEPTED state ..
Date Wed, 26 Oct 2016 03:09:06 GMT
Hello Mighty Hadoop Users,

We've been running into applications (MR/Tez) getting bottlenecked now and
then. Apps get stuck in the ACCEPTED state and take unpredictable amounts of
time to reach RUNNING. Our cluster is not particularly near peak load
capacity-wise, but the problem might be related to sudden bursts of
application submissions.

Scenario that I'm concerned about and trying to fix/optimize:
 - Applications start piling up in the ACCEPTED state. An app gets submitted,
   transitions from SUBMITTED to ACCEPTED, and then remains there for 5
   minutes, 10 minutes, or even 30 minutes in many cases, doing nothing.
 - The app's queue has available capacity during this time.
 - There is no user limit configured. We use the fair-share scheduler, so I
   don't think a default user limit is applied. *Please correct me if I'm
   wrong.*
 - The app then suddenly gets into RUNNING and finishes as usual.
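
To put numbers on the scenario above, the following is a minimal sketch (not something from our cluster tooling) that lists apps which are still in ACCEPTED and how long ago they were submitted. It assumes the ResourceManager REST endpoint /ws/v1/cluster/apps with the states=ACCEPTED filter is reachable, and that each app entry carries id, queue, state and startedTime (milliseconds since the epoch). The function names, the RM address you pass in, and the 300-second threshold are illustrative assumptions. Since startedTime is the submission time, the computed value only approximates time spent in ACCEPTED.

```python
import json
import time
from urllib.request import urlopen


def waiting_apps(payload, now_ms, min_wait_s=300):
    """Return (app id, queue, seconds since submission) for ACCEPTED apps.

    `payload` is the parsed JSON of /ws/v1/cluster/apps. The RM returns
    {"apps": null} when nothing matches, so both levels may be missing.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    rows = []
    for app in apps:
        if app.get("state") != "ACCEPTED":
            continue
        # startedTime is the submission time in ms; this approximates
        # how long the app has been waiting for its AM container.
        waited_s = (now_ms - app["startedTime"]) / 1000.0
        if waited_s >= min_wait_s:
            rows.append((app["id"], app.get("queue"), waited_s))
    # Longest-waiting apps first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


def fetch_accepted(rm_url):
    """Fetch ACCEPTED apps from the RM; rm_url is a hypothetical base URL
    such as "http://rm-host:8088" (host and port are assumptions)."""
    with urlopen(rm_url + "/ws/v1/cluster/apps?states=ACCEPTED") as resp:
        return json.load(resp)
```

Polling it once a minute with something like waiting_apps(fetch_accepted(rm_url), int(time.time() * 1000)) and logging the result next to the queue's available capacity would show whether the waits line up with submission bursts.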

We use Hadoop 2.6.0 (cdh5.7.4); most of the relevant configurations are at
their defaults. These are all MapReduce and Tez jobs. I tried increasing
yarn.resourcemanager.scheduler.client.thread-count=100
and yarn.resourcemanager.amlauncher.thread-count=100, but that didn't help.

I have attached the RM debug log (filtered to the app that was stuck for 11
minutes) and the NM log for the AM of that app. I would like to know what
tuning can help with this.

Much Appreciated,
-Gautam.
