hadoop-hdfs-issues mailing list archives

From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11237) NameNode reports incorrect file size
Date Wed, 21 Dec 2016 20:49:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768107#comment-15768107 ]

Manoj Govindassamy edited comment on HDFS-11237 at 12/21/16 8:48 PM:
---------------------------------------------------------------------

Hi [~bergenholtz],

{{fs -du}}, that is {{getContentSummary}}, is a NameNode-only operation. The NameNode does keep an account of all allocated blocks for a file, but for files that are currently being written it has no visibility into how much data has been written to the last, UNDER_CONSTRUCTION block on the various DataNodes. That is, the NameNode usually lacks byte-level length accuracy for files that are being written. It catches up on the block and file length when the file is closed, or when the client runs hflush or hsync on the write stream. Not just the {{du}} command, but other length-query commands/APIs such as {{ls}} or {{getFileStatus}} show the same behavior.
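
For illustration, here is a minimal Java sketch of that catch-up behavior. The path and class name are made up, and it assumes {{fs.defaultFS}} points at an HDFS cluster (so the cast to {{HdfsDataOutputStream}} is valid); treat it as a sketch rather than a recipe:

{code:java}
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream.SyncFlag;

public class UpdateLengthDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);          // assumes fs.defaultFS points at HDFS
    Path file = new Path("/tmp/length-demo.txt");  // hypothetical path

    FSDataOutputStream out = fs.create(file, true);
    out.writeBytes("data sitting in the under-construction block\n");

    // At this point the NameNode only knows about completed blocks, so
    // getContentSummary (what "fs -du" uses) may not reflect these bytes yet.
    System.out.println("before hsync: " + fs.getContentSummary(file).getLength());

    // hsync with UPDATE_LENGTH also asks the NameNode to record the current
    // length of the under-construction block, so length queries catch up.
    ((HdfsDataOutputStream) out).hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));

    System.out.println("after hsync:  " + fs.getContentSummary(file).getLength());

    out.close();  // closing finalizes the block; the reported length is exact again
    fs.close();
  }
}
{code}

As far as I know, the UPDATE_LENGTH flag is what explicitly pushes the under-construction block's current length to the NameNode; a plain hflush/hsync mainly guarantees that the written bytes are visible to new readers.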


> NameNode reports incorrect file size
> ------------------------------------
>
>                 Key: HDFS-11237
>                 URL: https://issues.apache.org/jira/browse/HDFS-11237
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Erik Bergenholtz
>
> The HDFS file /data/app-logs/log is continuously being written to by a YARN process.
> However, checking the file size through:
> hadoop fs -du /data/app-logs/log shows an incorrect file size after a few minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

