hadoop-common-user mailing list archives

From Xuan Gong <xg...@hortonworks.com>
Subject Re: Max Connect retries
Date Mon, 09 Feb 2015 04:42:32 GMT
That is the client connection retry at the IPC level.

You can decrease the max.retries by configuring

ipc.client.connect.max.retries.on.timeouts

in core-site.xml
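For example, the entry in core-site.xml would look something like the sketch below (the value of 20 is only an illustrative choice, not a recommended setting):

```xml
<property>
  <name>ipc.client.connect.max.retries.on.timeouts</name>
  <value>20</value>
</property>
```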


Thanks

Xuan Gong

From: Telles Nobrega <tellesnobrega@gmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Saturday, February 7, 2015 at 8:37 PM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Max Connect retries

Hi, I changed my cluster config so a failed nodemanager can be detected in about 30 seconds.
When I'm running a wordcount, the reduce gets stuck at 25% for quite a while, and the logs
show nodes trying to connect to the failed node:


org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-telles-844fb3f0-dfd8-456d-89c3-1d7cfdbdcad2/10.3.2.99:49911.
Already tried 28 time(s); maxRetries=45
2015-02-08 04:26:42,088 INFO [IPC Server handler 16 on 50037] org.apache.hadoop.mapred.TaskAttemptListenerImpl:
MapCompletionEvents request from attempt_1423319128424_0025_r_000000_0. startIndex 24 maxEvents
10000

Is this the expected behaviour? Should I change max retries to a lower value? If so, which
config is that?

Thanks

