hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback
Date Fri, 18 Oct 2013 19:49:42 GMT

    [ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799448#comment-13799448 ]

Jason Lowe commented on YARN-415:

It's not just a real-time issue; it's also a correctness issue.  When a container finishes
we need to know the time it was allocated.  So regardless of whether we want to compute the
usage in real-time, the start time of a container and its resource sizes need to be tracked
somewhere in the RM.

ResourceUsage is just a Resource plus a start time, and the Resource should be referencing
the same object already referenced by the Container inside RMContainerImpl.  To implement
this feature we need to track the containers that are allocated/running (already being done
by RMContainerImpl) and what time they started (which we are not currently doing, and which
is why ResourceUsage was created).
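
As a rough illustration of "a Resource plus a start time" (not code from any of the attached patches), the sketch below pairs a reserved memory size with an allocation timestamp and computes MB-seconds at completion. The class name ResourceUsageSketch, its fields, and the memorySeconds method are hypothetical; a plain int stands in for the shared Resource object, and timestamps are assumed to be wall-clock milliseconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a reserved memory size plus the time the container started. */
final class ResourceUsageSketch {
    private final int memoryMb;      // reserved memory; stands in for the shared Resource object
    private final long startTimeMs;  // assumed wall-clock allocation time, in milliseconds

    ResourceUsageSketch(int memoryMb, long startTimeMs) {
        this.memoryMb = memoryMb;
        this.startTimeMs = startTimeMs;
    }

    /** Reserved MB multiplied by whole seconds between allocation and finish. */
    long memorySeconds(long finishTimeMs) {
        // Clamp to zero so a clock adjustment cannot produce negative usage.
        long elapsedMs = Math.max(0L, finishTimeMs - startTimeMs);
        return memoryMb * TimeUnit.MILLISECONDS.toSeconds(elapsedMs);
    }
}
```

Without the stored start time, the finish event alone gives no way to compute the container's lifetime, which is the correctness concern raised above.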

There is the issue of the HashMap to map a container ID to its resource and start time.  We
could remove the need for this if we stored the container start time in RMContainerImpl and
had a safe way to look up containers for an application attempt.  We can get the containers
for an application via scheduler.getSchedulerAppInfo, and RMAppAttemptImpl already does this
when generating an app report.  However, since RMAppAttemptImpl and the scheduler are running
in separate threads, I could see the scheduler already removing the container before RMAppAttemptImpl
received the container completion event and tried to look up the container for usage calculation.
Given the race, along with the fact that getSchedulerAppInfo is not necessarily cheap, it
seems reasonable to have RMAppAttemptImpl track what it needs for running containers directly.
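
A minimal sketch of that direct tracking, under stated assumptions: the attempt keeps its own map from container ID to reserved memory and start time, so the completion handler never has to ask the scheduler for a container it may already have removed. AttemptUsageTracker and its method names are hypothetical and are not the API of RMAppAttemptImpl or of the attached patches; container IDs are plain strings here, and all calls are assumed to arrive on the attempt's own event-handling thread, so no locking is shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-attempt tracker; names are illustrative only. */
final class AttemptUsageTracker {
    /** Reserved memory and start time for one running container. */
    private static final class Entry {
        final int memoryMb;
        final long startTimeMs;

        Entry(int memoryMb, long startTimeMs) {
            this.memoryMb = memoryMb;
            this.startTimeMs = startTimeMs;
        }
    }

    // Owned by the attempt, so lookups do not depend on scheduler state.
    private final Map<String, Entry> running = new HashMap<>();
    private long memorySeconds;

    /** Record a container when the attempt learns it was allocated. */
    void containerAllocated(String containerId, int memoryMb, long nowMs) {
        running.put(containerId, new Entry(memoryMb, nowMs));
    }

    /** On completion, add reserved MB * lifetime in whole seconds to the total. */
    void containerFinished(String containerId, long nowMs) {
        Entry e = running.remove(containerId);
        if (e == null) {
            return; // unknown or already-finished container: nothing to charge
        }
        long elapsedSec = Math.max(0L, nowMs - e.startTimeMs) / 1000L;
        memorySeconds += e.memoryMb * elapsedSec;
    }

    long getMemorySeconds() {
        return memorySeconds;
    }
}
```

For example, a 1024 MB container that runs for 10 seconds and a 2048 MB container that runs for 10 seconds together contribute 1024 * 10 + 2048 * 10 = 30720 MB-seconds, and a duplicate completion event for either container changes nothing.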

> Capture memory utilization at the app-level for chargeback
> ----------------------------------------------------------
>                 Key: YARN-415
>                 URL: https://issues.apache.org/jira/browse/YARN-415
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>    Affects Versions: 0.23.6
>            Reporter: Kendall Thrapp
>            Assignee: Andrey Klochkov
>         Attachments: YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch,
>                      YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415.patch
> For the purpose of chargeback, I'd like to be able to compute the cost of an
> application in terms of cluster resource usage.  To start out, I'd like to get the memory
> utilization of an application.  The unit should be MB-seconds or something similar and, from
> a chargeback perspective, the memory amount should be the memory reserved for the application,
> as even if the app didn't use all that memory, no one else was able to use it.
> (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
> container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime
> of container n)
> It'd be nice to have this at the app level instead of the job level because:
> 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear
>    on the job history server).
> 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
> This new metric should be available both through the RM UI and RM Web Services REST API.

This message was sent by Atlassian JIRA