hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "xieguiming (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4074) Client continuously retries to RM When RM goes down before launching Application Master
Date Thu, 12 Apr 2012 12:11:20 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

xieguiming updated MAPREDUCE-4074:
----------------------------------

    Status: Patch Available  (was: Open)

Retry Scenarios:
1, AM finished, and will contact HS, and need to retry
2, If RM goes down, the IPC will retry automaticly. and no need retry here.
3, If HS goes down, the IPC will retry automaticly. and no need retry here.

So, we only need to consider scenario 1.
                
> Client continuously retries to RM When RM goes down before launching Application Master
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4074
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4074
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.23.1
>            Reporter: Devaraj K
>         Attachments: MAPREDUCE-4074.patch
>
>
> Client continuously tries to RM and logs the below messages when the RM goes down before
launching App Master. 
> I feel exception should be thrown or break the loop after finite no of retries.
> {code:xml}
> 28/03/12 07:15:03 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 0 time(s).
> 28/03/12 07:15:04 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 1 time(s).
> 28/03/12 07:15:05 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 2 time(s).
> 28/03/12 07:15:06 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 3 time(s).
> 28/03/12 07:15:07 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 4 time(s).
> 28/03/12 07:15:08 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 5 time(s).
> 28/03/12 07:15:09 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 6 time(s).
> 28/03/12 07:15:10 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 7 time(s).
> 28/03/12 07:15:11 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 8 time(s).
> 28/03/12 07:15:12 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 9 time(s).
> 28/03/12 07:15:13 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 0 time(s).
> 28/03/12 07:15:14 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 1 time(s).
> 28/03/12 07:15:15 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 2 time(s).
> 28/03/12 07:15:16 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 3 time(s).
> 28/03/12 07:15:17 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032.
Already tried 4 time(s).
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message