Date: Mon, 21 Oct 2013 21:31:57 +0000 (UTC)
From: "Sandy Ryza (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback

    [ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801112#comment-13801112 ]

Sandy Ryza commented on YARN-415:
---------------------------------

bq. However since RMAppAttemptImpl and the scheduler are running in separate threads, I could see the scheduler already removing the container before RMAppAttemptImpl received the container completion event and tried to lookup the container for usage calculation.
The most accurate timestamp for a container start/end with respect to utilization/chargeback is when the scheduler allocates/releases it, as that's the moment that the resource becomes inaccessible/accessible. Is there a reason we need to set these with an asynchronous event after the scheduler has completed the operation? Why not move the calculation entirely into the scheduler (providing code that could be shared across all schedulers)?

> Capture memory utilization at the app-level for chargeback
> ----------------------------------------------------------
>
>                 Key: YARN-415
>                 URL: https://issues.apache.org/jira/browse/YARN-415
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>    Affects Versions: 0.23.6
>            Reporter: Kendall Thrapp
>            Assignee: Andrey Klochkov
>         Attachments: YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415.patch
>
>
> For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it.
> (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n)
> It'd be nice to have this at the app level instead of the job level because:
> 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server).
> 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
> This new metric should be available both through the RM UI and RM Web Services REST API.
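
To make the MB-seconds formula from the issue description concrete, and to illustrate the suggestion of recording start/end times at the moment the scheduler allocates/releases a container, here is a minimal Python sketch. It is not YARN code: AppUsageTracker, ContainerUsage, on_allocate, on_release and mb_seconds are hypothetical names invented for illustration, and timestamps are assumed to be plain seconds passed in by the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ContainerUsage:
    """Reserved memory and lifetime of one container (hypothetical type)."""
    reserved_mb: int
    start_s: float
    end_s: Optional[float] = None  # None while the container is still held


@dataclass
class AppUsageTracker:
    """Per-application bookkeeping, assumed to be updated synchronously
    by the scheduler at allocate/release time (not a real YARN class)."""
    containers: Dict[str, ContainerUsage] = field(default_factory=dict)

    def on_allocate(self, container_id: str, reserved_mb: int, now_s: float) -> None:
        # Start the clock when the memory becomes unavailable to others.
        self.containers[container_id] = ContainerUsage(reserved_mb, now_s)

    def on_release(self, container_id: str, now_s: float) -> None:
        # Stop the clock when the memory becomes available again.
        self.containers[container_id].end_s = now_s

    def mb_seconds(self, now_s: float) -> float:
        # Sum of (reserved ram * lifetime) over all containers; containers
        # that are still running are charged up to now_s.
        total = 0.0
        for usage in self.containers.values():
            end_s = usage.end_s if usage.end_s is not None else now_s
            total += usage.reserved_mb * (end_s - usage.start_s)
        return total


tracker = AppUsageTracker()
tracker.on_allocate("container_1", 1024, now_s=0.0)
tracker.on_allocate("container_2", 2048, now_s=10.0)
tracker.on_release("container_1", now_s=60.0)
# 1024 MB * 60 s + 2048 MB * 90 s = 245760 MB-seconds
print(tracker.mb_seconds(now_s=100.0))
```

Because the entry is written in the same call that allocates or releases the container, there is no later lookup that could race with the scheduler removing the container, which is the concern raised in the quoted comment.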