hadoop-common-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13031) Rack-aware read bytes stats should be managed by HDFS specific StorageStatistics
Date Tue, 09 Aug 2016 22:20:22 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414339#comment-15414339 ]

Mingliang Liu commented on HADOOP-13031:

[~mingma], in this JIRA I plan to move the distance-specific rack-aware read bytes logic
out of {{FileSystemStorageStatistics}} and into a new HDFS-specific class, {{DFSRackAwareStorageStatistics}},
as discussed in [HADOOP-13032].
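
To make the idea concrete, here is a rough sketch of what a dedicated per-distance statistics class could look like. It is not the actual patch: the class name {{RackAwareReadStats}}, the key names, and the {{MAX_DISTANCE}} bucketing are all assumptions for illustration, and it uses {{java.util.concurrent.atomic.LongAdder}} as a simple stand-in rather than a hand-written thread-local mechanism.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for an HDFS-specific statistics class; not the actual patch. */
final class RackAwareReadStats {
    /** Largest distance tracked separately; larger distances share the last bucket (assumption). */
    static final int MAX_DISTANCE = 4;

    private final LongAdder[] bytesByDistance = new LongAdder[MAX_DISTANCE + 1];

    RackAwareReadStats() {
        for (int i = 0; i <= MAX_DISTANCE; i++) {
            bytesByDistance[i] = new LongAdder();
        }
    }

    /** Hypothetical key naming scheme, one key per network distance. */
    static String keyFor(int distance) {
        return "bytesReadDistance" + distance;
    }

    /** Records bytes read from a replica at the given network distance. */
    void addBytesRead(int distance, long bytes) {
        int bucket = Math.max(0, Math.min(distance, MAX_DISTANCE));
        bytesByDistance[bucket].add(bytes);
    }

    /** True only for the per-distance keys defined by this class. */
    boolean isTracked(String key) {
        return indexOf(key) >= 0;
    }

    /** Returns the current total for a key, or null if the key is not tracked. */
    Long getLong(String key) {
        int i = indexOf(key);
        return i < 0 ? null : Long.valueOf(bytesByDistance[i].sum());
    }

    private static int indexOf(String key) {
        for (int i = 0; i <= MAX_DISTANCE; i++) {
            if (keyFor(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }
}
```

With this shape, a file system that has no notion of rack distance simply never exposes these keys, and the base statistics class stays free of HDFS-only counters.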

We need to preserve the thread-local mechanism when doing this, which is tracked by [HADOOP-13435].
I have submitted a patch for [HADOOP-13435], and any comments or reviews are very welcome. Once
that is committed, [HADOOP-13032] should be almost done, as I have also attached the
initial patch. After those two blockers are resolved, I will consolidate the work of consuming
StorageStatistics in this JIRA. As you suggested, MR should probe before dumping the rack
distance-aware read bytes so that undefined counters are not displayed.
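
The probing step could look roughly like the sketch below. The {{StatsSource}} interface and the {{CounterDump}} helper are hypothetical names introduced only for this example; the {{isTracked}}/{{getLong}} shape is an assumption about how a consumer such as MR would query a statistics object, not the MR code itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal view of a statistics source that a framework could query. */
interface StatsSource {
    boolean isTracked(String key);
    Long getLong(String key);
}

/** Hypothetical helper: collects only the counters a given source actually tracks. */
final class CounterDump {
    static Map<String, Long> collect(StatsSource stats, List<String> keys) {
        Map<String, Long> out = new LinkedHashMap<>();
        for (String key : keys) {
            // Probe first so counters the file system does not define are skipped.
            if (!stats.isTracked(key)) {
                continue;
            }
            Long value = stats.getLong(key);
            if (value != null) {
                out.put(key, value);
            }
        }
        return out;
    }
}
```

A caller would pass the full list of counter names it knows about; for a file system without rack-aware statistics, the distance keys drop out of the result instead of showing up as undefined counters.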

> Rack-aware read bytes stats should be managed by HDFS specific StorageStatistics
> --------------------------------------------------------------------------------
>                 Key: HADOOP-13031
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13031
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
> [HADOOP-13065] added a new interface for retrieving FS and FC Statistics. This JIRA is
> to refactor the code that maintains rack-aware read metrics to use the newly added StorageStatistics.
> # Rack-aware read bytes metrics are mostly specific to HDFS. For example, the local file system
> doesn't need them. We are considering moving them from the base FileSystemStorageStatistics to a dedicated
> HDFS-specific StorageStatistics sub-class.
> # We would have to develop an optimized thread-local mechanism to do this, to avoid causing
> a regression in HDFS stream performance.
> Optionally, it would be better to simply move this to HDFS's existing per-stream {{ReadStatistics}}
> for now. As [HDFS-9579] states, ReadStatistics metrics are only accessible via {{DFSClient}}
> or {{DFSInputStream}}. Not something that application frameworks such as MR and Tez can get
