hadoop-hdfs-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
Date Thu, 07 Apr 2016 03:21:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229596#comment-15229596
] 

Mingliang Liu commented on HDFS-10175:
--------------------------------------

Thanks for running the test to quantify the overhead of the per-op stats map. Your numbers
are similar to our offline analysis. As for what the ~1.5K overhead per FileSystem/thread
buys us, the basic idea is to provide a fine-grained summary of file system operations.
Before this change, the only FS operation counters were readOps, writeOps, largeReadOps,
etc., which are not very useful for offline load analysis. One simple use case: a directory
is polluted by 1K small files because a job forgets to delete its temporary files after using
them. The {{create/delete}} counters will help users/admins locate the bad job very easily.
WriteOps is not very indicative because it also counts many other operations. I believe
[~hitesh] can show us more examples if you find his comment above not quite clear.
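
To make the use case concrete, here is a minimal sketch (not the code in the attached patches).
The class name, the {{Op}} enum, and the set of operations folded into {{writeOps()}} are all
assumptions made for illustration. It only shows why a per-operation breakdown points at the
bad job when the aggregate counter does not.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; names do not come from the HDFS-10175 patch.
class PerOpStatsSketch {
    /** Illustrative subset of operations taken from the issue description. */
    enum Op { CREATE, APPEND, DELETE, EXISTS, MKDIRS, RENAME }

    private final Map<Op, LongAdder> counters = new EnumMap<>(Op.class);

    PerOpStatsSketch() {
        // Pre-populate so every operation reports 0 until it is used.
        for (Op op : Op.values()) {
            counters.put(op, new LongAdder());
        }
    }

    void increment(Op op) {
        counters.get(op).increment();
    }

    long get(Op op) {
        return counters.get(op).sum();
    }

    /** Coarse aggregate; which operations it folds together is assumed here. */
    long writeOps() {
        return get(Op.CREATE) + get(Op.APPEND) + get(Op.DELETE)
            + get(Op.MKDIRS) + get(Op.RENAME);
    }

    public static void main(String[] args) {
        PerOpStatsSketch stats = new PerOpStatsSketch();
        // A job that creates 1K temporary files and never deletes them.
        for (int i = 0; i < 1000; i++) {
            stats.increment(Op.CREATE);
        }
        stats.increment(Op.MKDIRS);
        System.out.println("writeOps=" + stats.writeOps());
        System.out.println("create=" + stats.get(Op.CREATE)
            + " delete=" + stats.get(Op.DELETE));
    }
}
```

With only the aggregate, this job shows 1001 write ops, which says little. The per-op view
shows 1000 creates against 0 deletes, which is the signature of leaked temporary files.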

As the statistics are used by other file systems (e.g. S3A) besides HDFS, the per-operation
counters can be supported by those file systems too. FileSystem itself has several high-level
operations that are not supported by all concrete file systems, in which case a zero counter
value for an unsupported operation seems OK. Maybe [~jnp] has more comments about this problem.

Yes, {{HAS_NEXT}} is not a metric that is very interesting to users. I'll update the patch
for iterative {{listStatus}}. This is a very good catch.

In the very early stage of [HDFS-9579], it used a map (perhaps because it's more straightforward).
I think the idea of moving the hard-coded statistics longs into maps is good in this case. I'll
file a new JIRA to address this. Ping [~mingma] for more input.
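
As a rough illustration of what moving the hard-coded longs into a map could look like, here
is a minimal sketch. The class name and the string keys are hypothetical, and this is not the
design that the follow-up JIRA will necessarily adopt. The getters are kept so that existing
callers would not need to change.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of map-backed statistics; not taken from any patch.
class MapBackedStatsSketch {
    // Keys for the coarse statistics listed in the issue description.
    static final String BYTES_READ = "bytesRead";
    static final String BYTES_WRITTEN = "bytesWritten";
    static final String READ_OPS = "readOps";
    static final String LARGE_READ_OPS = "largeReadOps";
    static final String WRITE_OPS = "writeOps";

    private final Map<String, LongAdder> values = new ConcurrentHashMap<>();

    /** Adds delta to the named statistic, creating the entry on first use. */
    void add(String name, long delta) {
        values.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    /** Returns 0 for a statistic that was never recorded. */
    long get(String name) {
        LongAdder adder = values.get(name);
        return adder == null ? 0L : adder.sum();
    }

    // Getters that delegate to the map, so callers keep their current code.
    long getBytesRead() { return get(BYTES_READ); }
    long getBytesWritten() { return get(BYTES_WRITTEN); }
    long getReadOps() { return get(READ_OPS); }
    long getLargeReadOps() { return get(LARGE_READ_OPS); }
    long getWriteOps() { return get(WRITE_OPS); }
}
```

One consequence of this layout is that a statistic nobody has recorded simply reads as zero,
and a new statistic needs a new key rather than a new field.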

By the way, I think it's good to move the {{Statistics}} class out of {{FileSystem}}. However,
that would be an incompatible change, as the "import" statements in applications built on top
of it would have to be updated.

> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
>                 Key: HDFS-10175
>                 URL: https://issues.apache.org/jira/browse/HDFS-10175
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, HDFS-10175.002.patch,
>                      HDFS-10175.003.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. There is
> logic within DfsClient to map operations to these counters that can be confusing, for instance,
> mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, createSymlink,
> delete, exists, mkdirs, rename and expose them as new properties on the Statistics object.
> The operation-specific counters can be used for analyzing the load imposed by a particular
> job on HDFS.
> For example, we can use them to identify jobs that end up creating a large number of
> files.
> Once this information is available in the Statistics object, the app frameworks like
> MapReduce can expose them as additional counters to be aggregated and recorded as part of
> job summary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
