hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM
Date Wed, 17 Feb 2016 00:16:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149562#comment-15149562 ]

Li Lu commented on YARN-4696:

Thanks for the work [~stevel@apache.org]! My main question is: what is the assumed use
case for the "non-RM" mode of the reader, other than unit tests? If it's only for unit tests,
is there any way we can clearly restrict it to that? Because IIUC, if detached from the RM, all
app states will be unknown and the apps will eventually be treated as completed. However, that
status is not accurate, because it's only a timeout from unknownActiveMillis.
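
To make the concern concrete, here is a minimal sketch of the age-based fallback as I understand it. The class, enum, and method names are hypothetical and are not taken from the patch; the only name from the discussion is unknownActiveMillis, and the exact comparison the store uses is an assumption.

```java
/**
 * Hypothetical sketch, not the actual EntityGroupFSTimelineStore code.
 * Shows why "completed" is only an inference when no RM is available.
 */
final class UnknownAppAgeCheck {

    enum AppState { ACTIVE, UNKNOWN, COMPLETED }

    /**
     * @param reported            state obtained from the RM, or UNKNOWN if none
     * @param lastModifiedMillis  last modification time of the app's files
     * @param nowMillis           current time
     * @param unknownActiveMillis how long an unknown app is still assumed active
     */
    static AppState classify(AppState reported,
                             long lastModifiedMillis,
                             long nowMillis,
                             long unknownActiveMillis) {
        // A definite answer from the RM is always trusted as-is.
        if (reported != AppState.UNKNOWN) {
            return reported;
        }
        // Without the RM, the only signal is how long the files have been idle.
        long idleMillis = nowMillis - lastModifiedMillis;
        // Past the timeout the app is *assumed* completed; nothing confirmed it.
        return idleMillis > unknownActiveMillis
                ? AppState.COMPLETED
                : AppState.UNKNOWN;
    }
}
```

With no RM, every app enters through the UNKNOWN branch, so a slow but still running app that writes nothing for longer than the timeout would be classified as completed.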

For unit tests, is it possible to have a mock RM do the same job? If that is too much
trouble, then having this looks fine, but we need to clearly restrict the use case. 

- Line 46, EntityGroupFSTimelineStore: I think we'd prefer to avoid wildcard imports (import *)? 
- There is a findbugs warning about an inconsistent synchronization condition for LevelDBCacheTimelineStore,
where we may want to synchronize on the constructor? This is an unrelated failure, so feel
free to skip it. However, if you happen to have time, a quick fix would also be helpful. 

[~xgong], could you double check the logic on the writer side? Exception handling looks fine, but I
would like to double check the logic on the flush. 

> EntityGroupFSTimelineStore to work in the absence of an RM
> ----------------------------------------------------------
>                 Key: YARN-4696
>                 URL: https://issues.apache.org/jira/browse/YARN-4696
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>         Attachments: YARN-4696-001.patch
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running, and on the configuration
> pointing to it. This is a new change, and impacts testing where you have historically been
> able to test without an RM running.
> The sole purpose of the probe is to automatically determine if an app is running; it
> falls back to "unknown" if not. If the RM connection was optional, the "unknown" codepath
> could be called directly, relying on age of file as a metric of completion.
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to
> # disable retries on yarn client IPC; if it fails, tag app as unknown.
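
As a rough illustration of the first and third options above, the probe could be wrapped so that a disabled or failing RM lookup reports "unknown". This is a sketch under stated assumptions: OptionalRmProbe, its constructor flag, and the lookup function are hypothetical names for illustration, not the YARN client API and not code from the attached patch.

```java
import java.util.function.Function;

/**
 * Hypothetical sketch of an RM probe that is optional.
 * The rmLookup function stands in for whatever call asks the RM about an app.
 */
final class OptionalRmProbe {

    enum AppState { ACTIVE, COMPLETED, UNKNOWN }

    private final boolean rmEnabled;
    private final Function<String, AppState> rmLookup;

    OptionalRmProbe(boolean rmEnabled, Function<String, AppState> rmLookup) {
        this.rmEnabled = rmEnabled;
        this.rmLookup = rmLookup;
    }

    AppState probe(String appId) {
        // Option 1: an explicit flag skips the RM entirely.
        if (!rmEnabled) {
            return AppState.UNKNOWN;
        }
        try {
            return rmLookup.apply(appId);
        } catch (RuntimeException e) {
            // Option 3: a single failed call tags the app as unknown, no retries.
            return AppState.UNKNOWN;
        }
    }
}
```

In both fallback cases the caller would then take the existing "unknown" codepath and rely on file age to decide completion.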

This message was sent by Atlassian JIRA
