hadoop-yarn-issues mailing list archives

From "Gera Shegalov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM
Date Mon, 23 Dec 2013 22:54:50 GMT

     [ https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov updated YARN-1529:
--------------------------------

    Attachment: YARN-1529.v01.patch

{noformat}
$ curl -s http://somehost:8042/jmx?qry="Hadoop:service=NodeManager,name=NodeManagerMetrics" | python -mjson.tool
{
    "beans": [
        {
            "AllocatedContainers": 0,
            "AllocatedGB": 0,
            "AvailableGB": 8,
            "ContainersCompleted": 1,
            "ContainersFailed": 0,
            "ContainersIniting": 0,
            "ContainersKilled": 1,
            "ContainersLaunched": 2,
            "ContainersRunning": 0,
            "LocalizationDownloadNanos": 1803959000,
            "LocalizedBytesCached": 1529454,
            "LocalizedBytesCachedRatio": 49,
            "LocalizedBytesMissed": 1529546,
            "LocalizedFilesCached": 2,
            "LocalizedFilesCachedRatio": 33,
            "LocalizedFilesMissed": 4,
            "modelerType": "NodeManagerMetrics",
            "name": "Hadoop:service=NodeManager,name=NodeManagerMetrics",
            "tag.Context": "yarn",
            "tag.Hostname": "somehost"
        }
    ]
}
{noformat}
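
The same bean can also be read from a script. The following is only a minimal sketch, not part of the patch: it assumes the NodeManager web port 8042 and the bean name shown in the output above, and the helper names (jmx_url, localization_metrics, fetch_localization_metrics) are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Bean name as it appears in the "name" field of the sample output.
BEAN = "Hadoop:service=NodeManager,name=NodeManagerMetrics"


def jmx_url(host, port=8042, bean=BEAN):
    """Build the /jmx query URL; host and port are assumptions of this sketch."""
    return "http://%s:%d/jmx?qry=%s" % (host, port, quote(bean, safe=":=,"))


def localization_metrics(jmx_text, bean=BEAN):
    """Return only the Localiz* attributes of the named bean from a /jmx JSON reply."""
    for b in json.loads(jmx_text)["beans"]:
        if b.get("name") == bean:
            return {k: v for k, v in b.items() if k.startswith("Localiz")}
    raise KeyError(bean)


def fetch_localization_metrics(host, port=8042):
    """Fetch the bean over HTTP and keep the localization metrics."""
    with urlopen(jmx_url(host, port)) as resp:
        return localization_metrics(resp.read().decode("utf-8"))
```

For example, fetch_localization_metrics("somehost") would return a dict holding the seven Localiz* values from the output above, provided the host is reachable.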

> Add Localization overhead metrics to NM
> ---------------------------------------
>
>                 Key: YARN-1529
>                 URL: https://issues.apache.org/jira/browse/YARN-1529
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: YARN-1529.v01.patch
>
>
> Users are often unaware of the localization cost that their jobs incur. To measure the effectiveness of localization caches, it is necessary to expose the overhead in the form of metrics.
> We propose adding the following metrics to NodeManagerMetrics.
> When a container is about to launch, its set of LocalResources has to be fetched from a central location, typically on HDFS, which results in a number of download requests for the files missing from the caches.
> LocalizedFilesMissed: total files (requests) downloaded from DFS, i.e. cache misses.
> LocalizedFilesCached: total localization requests that were served from local caches, i.e. cache hits.
> LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
> LocalizedBytesCached: total bytes served from local caches.
> Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that were served out of cache: ratio = 100 * cached / (cached + missed)
> LocalizationDownloadNanos: total elapsed time in nanoseconds for a container to go from ResourceRequestTransition to LocalizedTransition
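
As a worked check of the ratio formula against the sample JMX output, here is a small sketch. The integer truncation and the zero-total behaviour are assumptions inferred from the sample values (the byte ratio is about 49.998 but is reported as 49); the patch itself may compute the ratio differently, and cached_ratio is a hypothetical name.

```python
def cached_ratio(cached, missed):
    """Percentage served from cache: 100 * cached / (cached + missed), truncated."""
    total = cached + missed
    if total == 0:
        return 0  # assumption: report 0 before anything has been localized
    return 100 * cached // total


# Values taken from the sample output above.
print(cached_ratio(2, 4))              # LocalizedFilesCachedRatio -> 33
print(cached_ratio(1529454, 1529546))  # LocalizedBytesCachedRatio -> 49

# LocalizationDownloadNanos converted to seconds for readability.
print(1803959000 / 1e9)                # about 1.8 seconds
```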



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
