hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
Date Tue, 26 Jan 2016 06:52:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116791#comment-15116791
] 

Ming Ma commented on HDFS-9579:
-------------------------------

Thanks [~sjlee0]! Good point about the thread visibility issue. I originally used a map to keep
the code general, so that it could support any network distance value without a code change.
However, since the set of possible network distance values rarely changes, using individual
long variables seems fine, given that it addresses the issues you mentioned above.

With individual long variables, it could look something like the code below. Note that it assumes
a tree-based topology, which should cover the common scenarios. If we need to track other network
distance values, we can update it later. In addition, this means bytesReadDistanceOfFour and
bytesReadDistanceOfSix won't be used in small network topologies.

{noformat}
volatile long bytesReadLocalHost;
volatile long bytesReadDistanceOfTwo; // local rack case.
volatile long bytesReadDistanceOfFour; // first-degree remote rack
volatile long bytesReadDistanceOfSix; // second-degree remote rack
{noformat}
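
As a rough sketch of how these fields might be updated and read (not the actual patch): the
class name DistanceReadStats and both method names are hypothetical, and the sketch assumes each
instance is written by a single owning thread, so volatile is only needed for visibility to
readers. Distances other than 0, 2, 4 and 6 are simply dropped here, which is also an assumption.

```java
// Hypothetical holder for the per-distance counters; not part of Hadoop.
class DistanceReadStats {
    // Assumption: only one thread ever writes to a given instance.
    // volatile makes its writes visible to other threads that read the
    // counters; it does NOT make += atomic across multiple writers.
    private volatile long bytesReadLocalHost;      // distance 0
    private volatile long bytesReadDistanceOfTwo;  // local rack
    private volatile long bytesReadDistanceOfFour; // first-degree remote rack
    private volatile long bytesReadDistanceOfSix;  // second-degree remote rack

    // Adds bytes to the counter that matches the given network distance.
    void incrementBytesReadByDistance(int distance, long bytes) {
        switch (distance) {
            case 0: bytesReadLocalHost += bytes; break;
            case 2: bytesReadDistanceOfTwo += bytes; break;
            case 4: bytesReadDistanceOfFour += bytes; break;
            case 6: bytesReadDistanceOfSix += bytes; break;
            default: break; // untracked distance: ignored in this sketch
        }
    }

    // Returns the counter for the given distance, or 0 if it is untracked.
    long getBytesReadByDistance(int distance) {
        switch (distance) {
            case 0: return bytesReadLocalHost;
            case 2: return bytesReadDistanceOfTwo;
            case 4: return bytesReadDistanceOfFour;
            case 6: return bytesReadDistanceOfSix;
            default: return 0L;
        }
    }
}
```

Compared with the map, supporting a new distance value means adding a field and a case, but
there is no shared mutable collection whose contents have to be published safely.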

I will update the patch once we agree on the new approach.

> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-9579
>                 URL: https://issues.apache.org/jira/browse/HDFS-9579
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, HDFS-9579.patch,
MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight as to the
traffic volume for each network distance to distinguish cross-DC traffic, local-DC-remote-rack,
etc.
> FileSystem's existing {{bytesRead}} metric tracks all the bytes read. To provide additional
metrics for each network distance, we can add new metrics at the FileSystem level and have
{{DFSInputStream}} update the values based on the network distance between the client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its initialization.
It doesn't need to resolve datanode's network location for each read as {{DatanodeInfo}} already
has the info.
> There are existing HDFS-specific metrics such as {{ReadStatistics}} and {{DFSHedgedReadMetrics}},
but these are only accessible via {{DFSClient}} or {{DFSInputStream}}, so application frameworks
such as MR and Tez cannot get to them. That is the benefit of storing these
new metrics in FileSystem.Statistics.
> This jira only covers metrics generation by HDFS. The consumption of these metrics
by MR and Tez will be tracked in separate jiras.
> We can add similar metrics for the HDFS write scenario later if necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
