hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7272) Enable timeline collector fault tolerance
Date Fri, 13 Oct 2017 06:55:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203129#comment-16203129
] 

Rohith Sharma K S commented on YARN-7272:
-----------------------------------------

Update: I had an offline discussion with Vinod. His concern is that the scope of this JIRA
is limited to auxiliary services that run on the NodeManager. Given that app collectors can be
launched as separate containers, which is a long-term goal but not supported yet, the fault
tolerance design should consider all of those use cases as well. Otherwise we will end up
redesigning a new fault tolerance solution later.
Thinking with respect to recovery of container-based app collectors, which also holds good for
auxiliary service recovery, storing the WAL in HDFS is more appropriate.
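
To illustrate the idea only, here is a minimal sketch of write-ahead logging with replay on
recovery. It is not the design or API proposed in this JIRA: the class WriteAheadLog, the
function recover_collector, the record fields, and the file name application_0001.wal are all
hypothetical, and a local temporary directory stands in for a shared HDFS directory. The point
it tries to show is that a replacement collector, wherever it is launched, can rebuild state
from a log that outlives the node the first collector ran on.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log of timeline records (hypothetical, not a YARN API)."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Persist the record before acknowledging the write to the caller.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Return every complete record; stop at a torn final line left by a crash.
        records = []
        if not os.path.exists(self.path):
            return records
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                if not line.endswith("\n"):
                    break  # partial trailing write, never acknowledged
                records.append(json.loads(line))
        return records


def recover_collector(wal_path):
    # A replacement collector rebuilds its pending state from the shared log.
    return WriteAheadLog(wal_path).replay()


# Assumption: a local temp directory stands in for a shared HDFS directory.
wal_dir = tempfile.mkdtemp()
wal_path = os.path.join(wal_dir, "application_0001.wal")

first = WriteAheadLog(wal_path)
first.append({"entity": "container_01", "event": "STARTED"})
first.append({"entity": "container_01", "event": "FINISHED"})

# Simulate the first collector dying in the middle of a write.
with open(wal_path, "a", encoding="utf-8") as f:
    f.write('{"entity": "container_02", "ev')

recovered = recover_collector(wal_path)
print(len(recovered), "records recovered")
```

Because the log location does not depend on which node hosts the collector, the same replay
step would apply to both the aux-service case and the container-based case.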

> Enable timeline collector fault tolerance
> -----------------------------------------
>
>                 Key: YARN-7272
>                 URL: https://issues.apache.org/jira/browse/YARN-7272
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineclient, timelinereader, timelineserver
>            Reporter: Vrushali C
>            Assignee: Rohith Sharma K S
>
> If an NM goes down, and along with it the timeline collector aux service for a running
> YARN app, we would like that YARN app to re-establish its connection with a new timeline collector.





