hadoop-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: Confusing Yarn RPC Configuration
Date Wed, 19 Aug 2015 07:03:19 GMT
>>> I believe it is the same issue for NodeManager connection
This is probably related to the issues below:
https://issues.apache.org/jira/i#browse/YARN-3944
https://issues.apache.org/jira/i#browse/YARN-3238


Thanks & Regards
Rohith Sharma K S

From: Jeff Zhang [mailto:zjffdu@gmail.com]
Sent: 18 August 2015 09:11
To: user@hadoop.apache.org
Subject: Confusing Yarn RPC Configuration


I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait when setting up
the RM connection. But the weird thing I found is that this configuration is not the real
max wait time. YARN actually converts it to a retry count using
yarn.resourcemanager.connect.retry-interval.ms. Let's say
yarn.resourcemanager.connect.max-wait.ms=10000 and
yarn.resourcemanager.connect.retry-interval.ms=2000; then YARN will create a
RetryUpToMaximumCountWithFixedSleep policy with max count = 5 (10000/2000).
However, each RM connection attempt also has its own retry policy inside Hadoop RPC. Let's
say ipc.client.connect.retry.interval=1000 and ipc.client.connect.max.retries=10, so each
RM connection attempt will try 10 times and cost 10 seconds in total (1000*10). So overall
the RM connection would cost 50 seconds (10 * 5), and this number is not consistent with
yarn.resourcemanager.connect.max-wait.ms, which confuses users. I am not sure of the purpose
of having 2 rounds of retry policy (the YARN side and the RPC-internal side). Should there
be only 1 round of retry policy, with the YARN-related configuration just overriding the
RPC configuration?
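
To make the arithmetic concrete, here is a minimal sketch in plain Java. It only models the numbers described above and does not call any Hadoop API; the class and method names are made up for illustration. Like the estimate above, it ignores the sleep between the outer YARN-level retries and assumes every inner RPC attempt fails.

```java
public class RetryWaitEstimate {

    // Outer (YARN-level) retry count, as described above:
    // max-wait.ms divided by retry-interval.ms.
    static long outerRetryCount(long maxWaitMs, long retryIntervalMs) {
        return maxWaitMs / retryIntervalMs;
    }

    // Time spent by the inner (IPC-level) retries for one outer attempt:
    // ipc.client.connect.retry.interval * ipc.client.connect.max.retries.
    static long innerAttemptMs(long ipcRetryIntervalMs, long ipcMaxRetries) {
        return ipcRetryIntervalMs * ipcMaxRetries;
    }

    // Rough worst-case total: every outer attempt exhausts the inner retries.
    // Simplification: the sleep between outer attempts is not counted.
    static long worstCaseMs(long maxWaitMs, long retryIntervalMs,
                            long ipcRetryIntervalMs, long ipcMaxRetries) {
        return outerRetryCount(maxWaitMs, retryIntervalMs)
                * innerAttemptMs(ipcRetryIntervalMs, ipcMaxRetries);
    }

    public static void main(String[] args) {
        long maxWaitMs = 10000;          // yarn.resourcemanager.connect.max-wait.ms
        long retryIntervalMs = 2000;     // yarn.resourcemanager.connect.retry-interval.ms
        long ipcRetryIntervalMs = 1000;  // ipc.client.connect.retry.interval
        long ipcMaxRetries = 10;         // ipc.client.connect.max.retries

        System.out.println("outer retries: "
                + outerRetryCount(maxWaitMs, retryIntervalMs));
        System.out.println("per-attempt ms: "
                + innerAttemptMs(ipcRetryIntervalMs, ipcMaxRetries));
        System.out.println("worst-case ms: "
                + worstCaseMs(maxWaitMs, retryIntervalMs,
                              ipcRetryIntervalMs, ipcMaxRetries));
    }
}
```

With the values above, this gives 5 outer retries, 10000 ms per attempt, and 50000 ms overall, against a configured max wait of 10000 ms.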

BTW, I believe it is the same issue for the NodeManager connection.

--
Best Regards

Jeff Zhang