hadoop-yarn-issues mailing list archives

From "Varun Vasudev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
Date Tue, 03 Mar 2015 09:58:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344848#comment-14344848 ]

Varun Vasudev commented on YARN-3248:

Thanks for the feedback [~ozawa], [~vinodkv]. 

bq. The blacklist is an instance of HashSet, so it can throw ConcurrentModificationException when
the blacklist is modified in another thread. One alternative is to use Collections.newSetFromMap(new
ConcurrentHashMap<Object,Boolean>()) instead of HashSet.

Good catch. Collections.newSetFromMap won't work because the blacklist itself is a set. I
create a copy of the structure in the latest patch.
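
For readers following the thread, here is a minimal sketch of the two options discussed above. It is not the code from the patch: the class BlacklistSnapshotDemo, the nested Blacklist holder, and the node names are hypothetical stand-ins, and the sketch assumes writers and the copy share one lock. It shows that a plain HashSet fails fast when changed during iteration, that a set backed by ConcurrentHashMap tolerates the change, and that iterating over a copy is unaffected by later updates.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class BlacklistSnapshotDemo {

    // Hypothetical stand-in for an application's blacklist; not a YARN class.
    static final class Blacklist {
        private final Set<String> nodes = new HashSet<>();

        synchronized void add(String node) {
            nodes.add(node);
        }

        // The copy is taken under the same lock the writers use (an assumption
        // of this sketch), because copying a HashSet also iterates over it.
        synchronized Set<String> snapshot() {
            return new HashSet<>(nodes);
        }
    }

    // The alternative suggested in the review: a set view over ConcurrentHashMap.
    static Set<String> newConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    // Adds an element while iterating and reports whether the iterator failed.
    // The change happens on the same thread here to keep the result
    // deterministic; the fail-fast check is the one a second thread could trip.
    static boolean throwsOnConcurrentChange(Set<String> set) {
        set.add("node1");
        set.add("node2");
        try {
            for (String node : set) {
                set.add("node3"); // structural change in the middle of iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashSet fails fast: "
            + throwsOnConcurrentChange(new HashSet<String>()));
        System.out.println("Concurrent set fails fast: "
            + throwsOnConcurrentChange(newConcurrentSet()));

        Blacklist blacklist = new Blacklist();
        blacklist.add("node1");
        Set<String> copy = blacklist.snapshot();
        blacklist.add("node2"); // later update; the copy does not see it
        System.out.println("Snapshot size: " + copy.size());
    }
}
```

The copy costs one allocation per read, which seems acceptable for a web UI page that only needs a count. The concurrent set avoids the copy, but its iterators are only weakly consistent, so a reader may or may not observe updates made while it iterates.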

bq. If AbstractYarnScheduler#getApplicationAttempt() can be used, I think it's more straightforward
and simple. What do you think?

Agreed. Changed the code.

bq. Could you add tests to TestRMWebServicesApps?

I'm not sure what tests to add. I'm not adding any new web services.

bq. The blacklist information is per application-attempt, and scheduler will forget previous application-attempts
today. I think this is a general behaviour with the way blacklisting is done today - each
AM is expected to explicitly blacklist all the nodes it wants to blacklist even if the previous
attempt already informed about some of them before. That is how all of resource requests work.
Given the above, we should make it clear that blacklists are really for this app-attempt.

I was under this impression as well, but the information is maintained on a per-app basis
in the AbstractYarnScheduler:
protected Map<ApplicationId, SchedulerApplication<T>> applications;

bq. W.r.t UI, showing the list of all the nodes is going to be a UI scalability problem -
how about we move this list to the per-app page? That is the place where this is useful the

Agreed. Made the change.

bq. We should also add this information to the web-services.

You mean the app information web service?

> Display count of nodes blacklisted by apps in the web UI
> --------------------------------------------------------
>                 Key: YARN-3248
>                 URL: https://issues.apache.org/jira/browse/YARN-3248
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Screenshot.jpg, apache-yarn-3248.0.patch
> It would be really useful when debugging app performance and failure issues to get a
count of the nodes blacklisted by individual apps displayed in the web UI.

This message was sent by Atlassian JIRA
