hadoop-yarn-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback
Date Thu, 14 Aug 2014 21:50:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097737#comment-14097737 ]

Eric Payne commented on YARN-415:
---------------------------------

[~jianhe], thank you very much for reviewing this patch.

{quote}
- we can reuse the previous rmAttempt and resource object
{code}
        RMAppAttempt rmAttempt = container.rmContext.getRMApps()
            .get(container.getApplicationAttemptId().getApplicationId())
            .getRMAppAttempt(container.getApplicationAttemptId());
        Resource resource = container.getContainer().getResource();
{code}
{quote}

I will reuse the Resource object, but I'm not sure if I can reuse the RMAppAttempt object.

In the code snippet below, the preemption path always updates the attempt metrics of the
current app attempt. In the chargeback (resource utilization metrics) path, that is not
always what we want. A container does not always complete before its attempt dies and a new
attempt starts. When that happens, the chargeback path should update the metrics of the
first attempt, not the second. The call to {{...getRMAppAttempt(container.getApplicationAttemptId())}}
always returns the attempt that started the container.

Now that I think about it, it seems like that is what we want in the preemption path as well.

[~leftnoteasy], can you please comment? If the preemption path should update the preemption
info for the attempt that started the finished container, then we can reuse the RMAppAttempt
object for both paths.

{code}
      // Preemption path: looks up the app's *current* attempt.
      if (ContainerExitStatus.PREEMPTED ==
          container.finishedStatus.getExitStatus()) {
        Resource resource = container.getContainer().getResource();
        RMAppAttempt rmAttempt =
            container.rmContext.getRMApps()
                .get(container.getApplicationAttemptId().getApplicationId())
                .getCurrentAppAttempt();
        rmAttempt.getRMAppAttemptMetrics().updatePreemptionInfo(resource,
            container);
      }

      // Chargeback path: looks up the attempt that started the container.
      RMAppAttempt rmAttempt =
          container.rmContext.getRMApps()
              .get(container.getApplicationAttemptId().getApplicationId())
              .getRMAppAttempt(container.getApplicationAttemptId());
{code}
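
To make the difference between the two lookups concrete, here is a minimal, self-contained Java sketch. The App and Attempt classes and their methods are hypothetical stand-ins written for this illustration; they are not the real RMApp and RMAppAttempt APIs, and the numbers are made up. It only shows that a lookup by the container's own attempt ID and a "current attempt" lookup can return different attempts once a second attempt has started.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AttemptLookupSketch {
    /** Hypothetical stand-in for an app attempt and its metrics (not a YARN class). */
    static final class Attempt {
        final int id;
        long memoryMbSeconds;
        Attempt(int id) { this.id = id; }
    }

    /** Hypothetical stand-in for an application: attempts by ID plus a "current" pointer. */
    static final class App {
        private final Map<Integer, Attempt> attempts = new LinkedHashMap<>();
        private Attempt current;

        Attempt startAttempt(int id) {
            Attempt a = new Attempt(id);
            attempts.put(id, a);
            current = a;            // the newest attempt becomes the current one
            return a;
        }
        Attempt getCurrentAttempt() { return current; }
        Attempt getAttempt(int id) { return attempts.get(id); }
    }

    public static void main(String[] args) {
        App app = new App();
        app.startAttempt(1);
        int containerAttemptId = 1; // the container was started by attempt 1
        app.startAttempt(2);        // attempt 1 dies; attempt 2 starts before the container finishes

        // Looking up by the container's own attempt ID charges attempt 1.
        app.getAttempt(containerAttemptId).memoryMbSeconds += 61440;

        System.out.println(app.getAttempt(1).memoryMbSeconds); // prints 61440
        System.out.println(app.getCurrentAttempt().id);        // prints 2
    }
}
```

In this sketch, charging through getCurrentAttempt() instead would have credited the usage to attempt 2, which is the mismatch described above.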

> Capture memory utilization at the app-level for chargeback
> ----------------------------------------------------------
>
>                 Key: YARN-415
>                 URL: https://issues.apache.org/jira/browse/YARN-415
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>    Affects Versions: 0.23.6
>            Reporter: Kendall Thrapp
>            Assignee: Andrey Klochkov
>         Attachments: YARN-415--n10.patch, YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch,
> YARN-415--n5.patch, YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415--n9.patch,
> YARN-415.201405311749.txt, YARN-415.201406031616.txt, YARN-415.201406262136.txt, YARN-415.201407042037.txt,
> YARN-415.201407071542.txt, YARN-415.201407171553.txt, YARN-415.201407172144.txt, YARN-415.201407232237.txt,
> YARN-415.201407242148.txt, YARN-415.201407281816.txt, YARN-415.201408062232.txt, YARN-415.201408080204.txt,
> YARN-415.201408092006.txt, YARN-415.201408132109.txt, YARN-415.patch
>
>
> For the purpose of chargeback, I'd like to be able to compute the cost of an
> application in terms of cluster resource usage.  To start out, I'd like to get the memory
> utilization of an application.  The unit should be MB-seconds or something similar and, from
> a chargeback perspective, the memory amount should be the memory reserved for the application,
> as even if the app didn't use all that memory, no one else was able to use it.
> (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
> container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime
> of container n)
> It'd be nice to have this at the app level instead of the job level because:
> 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear
> on the job history server).
> 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
> This new metric should be available both through the RM UI and RM Web Services REST API.
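
The sum in the issue description (reserved RAM times lifetime, added up over all containers) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: ContainerUsage and totalMbSeconds are hypothetical names invented here, not YARN classes or the patch's implementation, and the sample values are made up.

```java
import java.util.Arrays;
import java.util.List;

class MemorySecondsSketch {
    /** Hypothetical record of one container's reservation (not a YARN class). */
    static final class ContainerUsage {
        final long reservedMb;    // memory reserved for the container, in MB
        final long startMillis;   // container start time
        final long finishMillis;  // container finish time

        ContainerUsage(long reservedMb, long startMillis, long finishMillis) {
            this.reservedMb = reservedMb;
            this.startMillis = startMillis;
            this.finishMillis = finishMillis;
        }

        /** Reserved MB multiplied by lifetime in seconds (fractions truncated). */
        long mbSeconds() {
            return reservedMb * (finishMillis - startMillis) / 1000;
        }
    }

    /** Sums reserved-memory * lifetime over all containers of one application. */
    static long totalMbSeconds(List<ContainerUsage> containers) {
        long total = 0;
        for (ContainerUsage c : containers) {
            total += c.mbSeconds();
        }
        return total;
    }

    public static void main(String[] args) {
        List<ContainerUsage> app = Arrays.asList(
            new ContainerUsage(1024, 0, 60_000),   // 1024 MB for 60 s
            new ContainerUsage(2048, 0, 30_000));  // 2048 MB for 30 s
        System.out.println(totalMbSeconds(app) + " MB-seconds"); // prints 122880 MB-seconds
    }
}
```

The sketch charges for reserved memory rather than memory actually used, matching the chargeback reasoning in the description.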



--
This message was sent by Atlassian JIRA
(v6.2#6252)
