hadoop-yarn-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: AM timeout on RM failure?
Date Mon, 12 Aug 2013 17:22:53 GMT
You should probably look at the RMProxy code and the configs it uses. I am
hoping that all clients including the MR AM now use that proxy and so
older configs are no longer valid.

Bikas

-----Original Message-----
From: Karthik Kambatla [mailto:kasha@cloudera.com]
Sent: Sunday, August 11, 2013 8:45 PM
To: yarn-dev@hadoop.apache.org
Subject: AM timeout on RM failure?

Hi YARN devs,

I am working on the ZKRMStateStore and have a very basic question: on RM
failure, how long does the AM wait before crashing, and more importantly,
what controls it?

Looking into the code, I see the following two parameters:

   1. yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms
      - set to 1 min
   2. yarn.resourcemanager.resourcemanager.connect.{max.wait.secs|retry_interval.secs}
      - set by default to 15 mins and 30 seconds respectively
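
As a rough illustration of how a maximum wait and a fixed retry interval
combine, here is a minimal Python sketch. It assumes a simple fixed-sleep
retry budget; the function names are hypothetical, time is simulated, and
this is not the RMProxy code itself.

```python
def max_retries(max_wait_secs: int, retry_interval_secs: int) -> int:
    """Number of fixed-interval sleeps that fit in the wait budget (assumed model)."""
    if retry_interval_secs <= 0:
        raise ValueError("retry interval must be positive")
    return max_wait_secs // retry_interval_secs


def simulate_connect(max_wait_secs, retry_interval_secs, rm_back_after_secs=None):
    """Simulate a client retrying a connection with a fixed sleep.

    rm_back_after_secs is a hypothetical parameter: the simulated time at
    which the RM becomes reachable again (None means it never comes back).
    Returns (connected, elapsed_secs). Nothing really sleeps or connects.
    """
    elapsed = 0
    retries_left = max_retries(max_wait_secs, retry_interval_secs)
    while True:
        # Attempt a connection at the current simulated time.
        if rm_back_after_secs is not None and elapsed >= rm_back_after_secs:
            return True, elapsed
        # Give up once the retry budget is spent.
        if retries_left == 0:
            return False, elapsed
        # Otherwise sleep for the fixed interval and try again.
        elapsed += retry_interval_secs
        retries_left -= 1


# Defaults quoted above: 15 minutes max wait, 30 seconds between retries.
print(max_retries(15 * 60, 30))
print(simulate_connect(15 * 60, 30))
print(simulate_connect(15 * 60, 30, rm_back_after_secs=100))
```

Under this model the defaults allow 30 retries, so a client gives up 15
minutes after the RM goes away unless the RM returns first.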

The AM crashes only after 20 minutes.

Are there any other configs that influence this?

Thanks
Karthik
