hadoop-hdfs-issues mailing list archives

From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11237) NameNode reports incorrect file size
Date Wed, 21 Dec 2016 20:39:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768107#comment-15768107 ]

Manoj Govindassamy commented on HDFS-11237:

Hi [~bergenholtz],

{{fs -du}}, that is, {{getContentSummary}}, is a NameNode-only operation. The NameNode has
an account of all allocated blocks for a file. For files that are currently being written,
however, the NameNode has no visibility into how much data has been written to the last,
UNDER_CONSTRUCTION block on the various DataNodes. That is, most of the time the NameNode
lacks byte-level length accuracy for files being written. The NameNode eventually catches up
when the file is closed or when the client runs hflush or hsync on the write stream. This
applies not just to the {{du}} command: other length-query commands and APIs, such as {{ls}}
or {{getFileStatus}}, behave similarly.
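
As a rough illustration of the behavior described above, here is a toy model in plain Java. It
is not HDFS code: the class {{LengthVisibilityModel}}, its methods, and the tiny 128-byte block
size are all hypothetical and exist only to show why the length reported by the NameNode can
trail the bytes actually written until a sync or close happens.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT Hadoop code. All names and the block size are assumptions.
class LengthVisibilityModel {
    // Assumed, deliberately tiny block size (bytes) so the numbers stay readable.
    static final long BLOCK_SIZE = 128;

    // Lengths of finished blocks; in this model the NameNode always knows these.
    private final List<Long> completedBlocks = new ArrayList<>();
    // Bytes really written to the last (under-construction) block on the DataNodes.
    private long lastBlockOnDataNodes = 0;
    // What the NameNode last heard about the under-construction block.
    private long lastBlockKnownToNameNode = 0;

    // Client writes data; only completed blocks become visible to the NameNode.
    void write(long bytes) {
        while (bytes > 0) {
            long room = BLOCK_SIZE - lastBlockOnDataNodes;
            long n = Math.min(room, bytes);
            lastBlockOnDataNodes += n;
            bytes -= n;
            if (lastBlockOnDataNodes == BLOCK_SIZE) {
                // Block is full: record it and start a fresh under-construction block.
                completedBlocks.add(BLOCK_SIZE);
                lastBlockOnDataNodes = 0;
                lastBlockKnownToNameNode = 0;
            }
        }
    }

    // Stands in for the sync or close that lets the NameNode catch up.
    void syncOrClose() {
        lastBlockKnownToNameNode = lastBlockOnDataNodes;
    }

    // Length a NameNode-only query (du, ls, getFileStatus) would report in this model.
    long lengthSeenByNameNode() {
        return sum(completedBlocks) + lastBlockKnownToNameNode;
    }

    // Bytes actually written so far.
    long actualLength() {
        return sum(completedBlocks) + lastBlockOnDataNodes;
    }

    private static long sum(List<Long> lengths) {
        long total = 0;
        for (long len : lengths) {
            total += len;
        }
        return total;
    }

    public static void main(String[] args) {
        LengthVisibilityModel file = new LengthVisibilityModel();
        file.write(300);
        System.out.println(file.lengthSeenByNameNode()); // 256: two full blocks only
        System.out.println(file.actualLength());         // 300
        file.syncOrClose();
        System.out.println(file.lengthSeenByNameNode()); // 300: caught up
    }
}
```

In this sketch, writing 300 bytes fills two 128-byte blocks and leaves 44 bytes in the
under-construction block, so the modeled NameNode reports 256 until {{syncOrClose}} is called.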

> NameNode reports incorrect file size
> ------------------------------------
>                 Key: HDFS-11237
>                 URL: https://issues.apache.org/jira/browse/HDFS-11237
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Erik Bergenholtz
> The [hdfs] file /data/app-logs/log is continuously being written to by a YARN process.
> However, checking the file size through:
> hadoop fs -du /data/app-logs/log shows an incorrect file size after a few minutes.
