hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
Date Sat, 19 Mar 2016 21:14:33 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202974#comment-15202974 ]

Hudson commented on HDFS-9579:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #9478 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9478/])
HDFS-9579. Provide bytes-read-by-network-distance metrics at (sjlee: rev cd8b6889a74a949e37f4b2eb664cdf3b59bfb93b)
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NodeBase.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReader.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestConnCache.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetUtils.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java


> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-9579
>                 URL: https://issues.apache.org/jira/browse/HDFS-9579
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 3.0.0
>
>         Attachments: HDFS-9579-10.patch, HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, HDFS-9579-5.patch, HDFS-9579-6.patch, HDFS-9579-7.patch, HDFS-9579-8.patch, HDFS-9579-9.patch, HDFS-9579.patch, MR job counters.png
>
>
> For cross-DC distcp and other applications, it is useful to know the traffic volume at each network distance, so that cross-DC traffic can be distinguished from local-DC-remote-rack traffic, etc.
> FileSystem's existing {{bytesRead}} metric tracks all bytes read. To provide a breakdown by network distance, we can add metrics at the FileSystem level and have {{DFSInputStream}} update them based on the network distance between the client and the datanode.
> {{DFSClient}} will resolve the client machine's network location as part of its initialization. It doesn't need to resolve the datanode's network location for each read, as {{DatanodeInfo}} already has that information.
> There are existing HDFS-specific metrics such as {{ReadStatistics}} and {{DFSHedgedReadMetrics}}, but these are only accessible via {{DFSClient}} or {{DFSInputStream}}, which application frameworks such as MR and Tez cannot reach. That is the benefit of storing these new metrics in FileSystem.Statistics.
> This jira only includes metrics generation by HDFS. The consumption of these metrics by MR and Tez will be tracked in separate jiras.
> We can add similar metrics for the HDFS write scenario later if necessary.
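
The idea described above can be illustrated with a minimal, self-contained sketch. It is not the committed patch: the class DistanceReadStats and its methods are hypothetical names made up for illustration, and the distance rule (hops from each node up to their closest common ancestor in a "/dc/rack/host" style path) is an assumption about how network distance is counted.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch only; not the Hadoop implementation.
 * Tracks total bytes read plus a per-network-distance breakdown.
 */
final class DistanceReadStats {
    // The client's location is resolved once, at construction time.
    private final String clientLocation;
    private final Map<Integer, Long> bytesByDistance = new TreeMap<>();
    private long bytesRead;

    DistanceReadStats(String clientLocation) {
        this.clientLocation = clientLocation;
    }

    /**
     * Assumed distance rule: number of hops from each node up to the
     * closest common ancestor. Both paths must start with "/".
     */
    static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    /** Called by the reader after each read from a datanode. */
    void recordRead(String datanodeLocation, long bytes) {
        bytesRead += bytes;
        int d = distance(clientLocation, datanodeLocation);
        bytesByDistance.merge(d, bytes, Long::sum);
    }

    long getBytesRead() {
        return bytesRead;
    }

    long getBytesReadByDistance(int d) {
        return bytesByDistance.getOrDefault(d, 0L);
    }

    public static void main(String[] args) {
        DistanceReadStats stats = new DistanceReadStats("/dc1/rack1/host1");
        stats.recordRead("/dc1/rack1/host1", 100L); // same host
        stats.recordRead("/dc1/rack2/host7", 200L); // same DC, other rack
        stats.recordRead("/dc2/rack1/host3", 400L); // other DC
        System.out.println("total=" + stats.getBytesRead());
        System.out.println("crossDC=" + stats.getBytesReadByDistance(6));
    }
}
```

Under this assumed three-level topology, a framework that can see the statistics object could report cross-DC volume by reading the counter for the largest distance, without touching any HDFS-specific class.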



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
