hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13065) Add a new interface for retrieving FS and FC Statistics
Date Mon, 16 May 2016 18:09:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284962#comment-15284962
] 

Steve Loughran commented on HADOOP-13065:
-----------------------------------------

I'm just hooking this up to S3A, with the actual data being retained in the S3AInstrumentation.

One thing I'd like is to be confident that no instances are retained after an FS gets
deleted. Is that the case? Or, put differently: how can I get the storage statistics lifecycle
to match that of the specific FS instance?
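
To make the lifecycle question concrete, here is a minimal sketch of one way a statistics object could be tied to the instance that owns it. It is plain Java, not Hadoop code: `InstanceStats`, `StatsRegistry`, and `SketchFileSystem` are hypothetical names invented for illustration, and the assumption is that a global registry holds the per-instance statistics and that closing the instance is what removes them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch only: none of these types are Hadoop classes.
public class StatsLifecycleSketch {

    /** Hypothetical per-instance counter set. */
    static final class InstanceStats {
        private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

        void increment(String op) {
            counters.computeIfAbsent(op, k -> new AtomicLong()).incrementAndGet();
        }

        long get(String op) {
            AtomicLong c = counters.get(op);
            return c == null ? 0L : c.get();
        }
    }

    /** Hypothetical global registry; an entry lives only as long as its owner. */
    static final class StatsRegistry {
        private static final Map<String, InstanceStats> LIVE = new ConcurrentHashMap<>();

        static InstanceStats register(String id) {
            InstanceStats stats = new InstanceStats();
            LIVE.put(id, stats);
            return stats;
        }

        static void unregister(String id) {
            LIVE.remove(id);
        }

        static int liveCount() {
            return LIVE.size();
        }
    }

    /** Hypothetical stand-in for a filesystem instance that owns its statistics. */
    static final class SketchFileSystem implements AutoCloseable {
        private final String id;
        private final InstanceStats stats;

        SketchFileSystem(String id) {
            this.id = id;
            this.stats = StatsRegistry.register(id);
        }

        void mkdirs(String path) {
            stats.increment("mkdirs");
        }

        InstanceStats stats() {
            return stats;
        }

        @Override
        public void close() {
            // Dropping the registry entry here is what keeps the statistics
            // from outliving the instance.
            StatsRegistry.unregister(id);
        }
    }

    public static void main(String[] args) {
        try (SketchFileSystem fs = new SketchFileSystem("example-fs")) {
            fs.mkdirs("/tmp/x");
            System.out.println("mkdirs while open: " + fs.stats().get("mkdirs"));
            System.out.println("registered while open: " + StatsRegistry.liveCount());
        }
        System.out.println("registered after close: " + StatsRegistry.liveCount());
    }
}
```

Whether the real storage statistics are unregistered in this way is exactly the open question in the comment; the sketch only shows the property being asked for.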

> Add a new interface for retrieving FS and FC Statistics
> -------------------------------------------------------
>
>                 Key: HADOOP-13065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13065
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-13065-007.patch, HADOOP-13065.008.patch, HADOOP-13065.009.patch,
HADOOP-13065.010.patch, HADOOP-13065.011.patch, HADOOP-13065.012.patch, HADOOP-13065.013.patch,
HDFS-10175.000.patch, HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch,
HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. There is
logic within DfsClient to map operations to these counters that can be confusing, for instance,
mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, createSymlink,
delete, exists, mkdirs, rename and expose them as new properties on the Statistics object.
The operation-specific counters can be used for analyzing the load imposed by a particular
job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large number of
files.
> Once this information is available in the Statistics object, the app frameworks like
MapReduce can expose them as additional counters to be aggregated and recorded as part of
job summary.
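
The proposal in the quoted description can be illustrated with a small sketch of per-operation counters. This is plain Java under stated assumptions, not the patch attached to the issue: the `Op` enum and the `OpCountersSketch` class are hypothetical, and the operation names are taken from the list in the description.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: not the Hadoop Statistics API.
public class OpCountersSketch {

    /** Operations named in the issue description. */
    enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

    // The map is fully populated in the constructor and never modified
    // structurally afterwards; only the LongAdder values change.
    private final Map<Op, LongAdder> counts = new EnumMap<>(Op.class);

    public OpCountersSketch() {
        for (Op op : Op.values()) {
            counts.put(op, new LongAdder());
        }
    }

    /** Record one invocation of the given operation. */
    void record(Op op) {
        counts.get(op).increment();
    }

    /** Current count for the given operation. */
    long count(Op op) {
        return counts.get(op).sum();
    }

    public static void main(String[] args) {
        OpCountersSketch stats = new OpCountersSketch();
        for (int i = 0; i < 3; i++) {
            stats.record(Op.CREATE);
        }
        stats.record(Op.MKDIRS);
        // A framework could publish each entry as its own job counter.
        for (Op op : Op.values()) {
            System.out.println(op + "=" + stats.count(op));
        }
    }
}
```

With one counter per operation, mkdirs is reported under its own name rather than folded into a generic write count, which is what would let a job that creates many files or directories stand out.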



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

