hadoop-hdfs-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk
Date Tue, 05 Jan 2016 05:57:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082446#comment-15082446 ]

Harsh J commented on HDFS-9521:

The patch's approach looks good to me. I agree with [~liuml07] that we can also keep the total
time (but indicate in the message that it includes both times). Alternatively, a single combined
log at the end that prints the total and the separate times (along with path info, as we have it
in the current patch) would work well too.

I do not agree on DEBUG level though. The change is a refinement of an existing, vital INFO

Please also address the checkstyle issues, if they are relevant (sorry, I am too late here and
the build data has already been wiped). You can run the checkstyle goal with Maven to get the same
results locally.

The failing tests don't appear related.

> TransferFsImage.receiveFile should account and log separate times for image download
and fsync to disk 
> -------------------------------------------------------------------------------------------------------
>                 Key: HDFS-9521
>                 URL: https://issues.apache.org/jira/browse/HDFS-9521
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Minor
>         Attachments: HDFS-9521.patch
> Currently, TransferFsImage.receiveFile logs the total transfer time as follows:
> {noformat}
> double xferSec = Math.max(
>        ((float)(Time.monotonicNow() - startTime)) / 1000.0, 0.001);    
> long xferKb = received / 1024;
> LOG.info(String.format("Transfer took %.2fs at %.2f KB/s", xferSec, xferKb / xferSec));
> {noformat}
> This is really useful, but it only measures the total method execution time, which includes
both the time taken to download the image and the time to fsync it to all the namenode metadata directories.
> Sometimes, when troubleshooting these image transfer problems, it is useful to know
which part of the process is the bottleneck (network or disk write).
> This patch accounts for the image download time and the fsync time to each disk separately, logging
how much time each operation took.
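
For illustration only, the idea of separate download and fsync timings reported in one combined log line could be sketched roughly as below. This is not the code from HDFS-9521.patch: the class name TransferTimingSketch, the helper formatTimings, and the message wording are all hypothetical, System.nanoTime() stands in for Hadoop's Time.monotonicNow(), and the sleeps stand in for the real download and fsync work.

```java
import java.util.Locale;

// Hypothetical sketch, not the actual patch: times two phases separately
// and reports them in one combined message.
class TransferTimingSketch {

  // Hypothetical helper: builds a single log line from the two phase
  // durations (milliseconds) and the number of bytes received.
  static String formatTimings(long downloadMs, long fsyncMs, long receivedBytes) {
    // Clamp each phase to at least 1 ms, as the existing snippet does for
    // the total, so the rate calculation never divides by zero.
    double downloadSec = Math.max(downloadMs / 1000.0, 0.001);
    double fsyncSec = Math.max(fsyncMs / 1000.0, 0.001);
    double totalSec = downloadSec + fsyncSec;
    long xferKb = receivedBytes / 1024;
    return String.format(Locale.ROOT,
        "Transfer took %.2fs (download %.2fs at %.2f KB/s, fsync %.2fs)",
        totalSec, downloadSec, xferKb / downloadSec, fsyncSec);
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime();
    Thread.sleep(20);                    // stand-in for the image download
    long afterDownload = System.nanoTime();
    Thread.sleep(5);                     // stand-in for the fsync to disk
    long afterFsync = System.nanoTime();

    long downloadMs = (afterDownload - start) / 1_000_000;
    long fsyncMs = (afterFsync - afterDownload) / 1_000_000;
    // 4 MB is an arbitrary example size, not a value from the issue.
    System.out.println(formatTimings(downloadMs, fsyncMs, 4L * 1024 * 1024));
  }
}
```

Here the transfer rate is computed against the download time alone, so a slow disk does not make the network look slow; whether the actual patch reports the rate this way is an assumption.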

This message was sent by Atlassian JIRA
