hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior
Date Mon, 10 Oct 2016 18:53:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563137#comment-15563137 ]

Vrushali C commented on YARN-5718:
----------------------------------

Thanks Junping, I looked at the patch and I think I agree that external YARN clients should
not set/override HDFS retry settings. 

Now, with this patch, the yarn config variables TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_RETRY_POLICY_SPEC,
FS_NODE_LABELS_STORE_RETRY_POLICY_SPEC and their defaults are no longer used anywhere in the
code. Should they be removed?
Also, YarnConfiguration.FS_RM_STATE_STORE_RETRY_POLICY_SPEC is used in a test case in TestFSRMStateStore.java,
so should that be changed too? 

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings
which could cause unexpected behavior
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-5718
>                 URL: https://issues.apache.org/jira/browse/YARN-5718
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, timelineclient
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: YARN-5718.patch
>
>
> In one HA cluster, after the NN failed over, we noticed that jobs were failing because TimelineClient
failed to retry the connection to the proper NN. This is because we overwrite the hdfs client settings
and hard-code the retry policy to be enabled, which conflicts with the NN failover case: the hdfs client
should fail fast so it can retry on another NN.
> We shouldn't assume any retry policy for the hdfs client anywhere in YARN. This should
stay consistent with the HDFS settings, which have different retry policies in different deployment
cases. Thus, we should clean up these hard-coded settings in YARN, including: FileSystemTimelineWriter,
FileSystemRMStateStore and FileSystemNodeLabelsStore.
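
For readers following the thread, here is a rough sketch of the override pattern the description refers to. It is not the code from YARN-5718.patch: a plain Map stands in for a Hadoop Configuration, the class and method names are hypothetical, and the "2000, 500" spec value is only an illustrative assumption. The two dfs.client.retry.policy.* keys are the HDFS client settings in question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; a Map stands in for a Hadoop Configuration.
public class RetryOverrideSketch {
    // HDFS client retry keys that the YARN-side code was overriding.
    static final String RETRY_ENABLED = "dfs.client.retry.policy.enabled";
    static final String RETRY_SPEC = "dfs.client.retry.policy.spec";

    // The pattern under discussion: copy the cluster configuration, then
    // force the HDFS client retry policy on, whatever HDFS was configured with.
    static Map<String, String> withHardCodedRetry(Map<String, String> siteConf,
                                                  String yarnRetrySpec) {
        Map<String, String> fsConf = new HashMap<>(siteConf);
        fsConf.put(RETRY_ENABLED, "true");
        fsConf.put(RETRY_SPEC, yarnRetrySpec);
        return fsConf;
    }

    // The proposed direction: pass the HDFS settings through untouched, so
    // the deployment's own choice (e.g. fail fast under HA) is respected.
    static Map<String, String> withoutOverride(Map<String, String> siteConf) {
        return new HashMap<>(siteConf);
    }

    public static void main(String[] args) {
        Map<String, String> site = new HashMap<>();
        site.put(RETRY_ENABLED, "false"); // assumed HA deployment choice

        // "2000, 500" is an assumed example value for a YARN-side retry spec.
        Map<String, String> overridden = withHardCodedRetry(site, "2000, 500");
        Map<String, String> passthrough = withoutOverride(site);

        System.out.println("overridden:  " + overridden.get(RETRY_ENABLED));
        System.out.println("passthrough: " + passthrough.get(RETRY_ENABLED));
    }
}
```

In the first variant the deployment's fail-fast setting is silently replaced, which matches the behavior described above; in the second the client keeps whatever HDFS was configured with.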



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

