hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-9038) Reserved space is erroneously counted towards non-DFS used.
Date Thu, 10 Dec 2015 16:56:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051217#comment-15051217 ]

Arpit Agarwal edited comment on HDFS-9038 at 12/10/15 4:55 PM:
---------------------------------------------------------------

Sorry for the delay in responding; I am out this week. I just did some quick testing with
a single DN using a 40GB ext4 partition.

With Brahma's v005 patch modified to use {{File#getUsableSpace}}, non-DFS used is reported
as 2.06 GB, i.e. ~5% of the configured capacity. The actual disk usage was 49 MB.
{code}
$ bin/hdfs dfsadmin -report
...
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2215555072 (2.06 GB)  <<<<
DFS Remaining: 39925968896 (37.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.74%

$ df -h /mnt/sdb/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         40G   49M   38G   1% /mnt/sdb <<<<
{code}
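
As a sanity check, the figures in this report are consistent with non-DFS used being derived as configured capacity minus DFS used minus DFS remaining. The sketch below only reproduces that arithmetic from the numbers above; the class and method names are hypothetical and this is not the DataNode code.

```java
public class NonDfsUsedCheck {
    // Hypothetical helper: derives non-DFS used from the other three report fields.
    // Clamped at zero so an over-reported "remaining" cannot yield a negative value.
    static long nonDfsUsed(long capacity, long dfsUsed, long remaining) {
        return Math.max(0L, capacity - dfsUsed - remaining);
    }

    // Hypothetical helper: share of 'whole' taken by 'part', as a percentage.
    static double percentOf(long part, long whole) {
        return whole == 0 ? 0.0 : 100.0 * part / whole;
    }

    public static void main(String[] args) {
        // Values copied from the dfsadmin -report output above.
        long capacity = 42141548544L;
        long dfsUsed = 24576L;
        long remaining = 39925968896L;

        long nonDfs = nonDfsUsed(capacity, dfsUsed, remaining);
        System.out.println("Non DFS Used: " + nonDfs); // 2215555072, as in the report
        System.out.printf("Share of capacity: %.2f%%%n", percentOf(nonDfs, capacity));
    }
}
```

The result is a little over 5% of capacity, which is in line with the default ext4 root reserve of 5% rather than with the 49M that {{df}} reports as used.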

With {{File#getFreeSpace}}, i.e. the unmodified v005 patch, non-DFS used is reported as 48.93 MB, which matches the 49M that {{df}} shows.
{code}
$ bin/hdfs dfsadmin -report

...
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 51302400 (48.93 MB)
{code}
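
The gap between the two runs is what one would expect if {{File#getFreeSpace}} includes the blocks that ext4 reserves for root while {{File#getUsableSpace}} excludes them. That is my understanding of how the JVM behaves on Linux, but treat it as an assumption to verify on your own platform. A minimal sketch for checking it on a given mount (the class name and default path are illustrative only):

```java
import java.io.File;

public class SpaceProbe {
    // Bytes that are free on the volume but not usable by the user running this JVM.
    // On ext4 this is expected to approximate the root-reserved space (assumption).
    static long reservedGap(File dir) {
        return Math.max(0L, dir.getFreeSpace() - dir.getUsableSpace());
    }

    public static void main(String[] args) {
        // Pass the volume directory as the first argument, e.g. /mnt/sdb.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("total:    " + dir.getTotalSpace());
        System.out.println("free:     " + dir.getFreeSpace());
        System.out.println("usable:   " + dir.getUsableSpace());
        System.out.println("reserved: " + reservedGap(dir));
    }
}
```

Run as a non-root user against the test partition, the "reserved" line should be close to the 2.06 GB difference seen between the two reports, if the assumption holds.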

I also checked the current behavior in trunk (no patch). It is already broken: it counts the
system-reserved space towards non-DFS used.
{code}
$ bin/hdfs dfsadmin -report
...

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2215555072 (2.06 GB)
{code}

I think {{File#getFreeSpace}} is the correct choice. However, modifying Brahma's patch to use
{{File#getUsableSpace}} would not introduce a regression relative to what is already in trunk, so it's
fine to change it to address Chris's concern. I'll file a separate Jira to discuss fixing
this part.


was (Author: arpitagarwal):
Sorry for the delay in responding, I am out this week. I just did some quick testing with
a single DN using a 40GB ext4 partition.

With Brahma's v005 patch modified to use {{File#getUsableSpace}} non-DFS usage is reported
as 2GB, i.e. ~5%. The actual disk usage was 49MB.
{code}
$ bin/hdfs dfsadmin -report
...
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2215555072 (2.06 GB)  <<<<
DFS Remaining: 39925968896 (37.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.74%

$ df -h /mnt/sdb/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         40G   49M   38G   1% /mnt/sdb <<<<
{code}

With {{File#getFreeSpace}} i.e. the v005 patch, the non-DFS used is more accurate.
{code}
$ bin/hdfs dfsadmin -report

...
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 51302400 (48.93 MB)
{code}

I also checked the current behavior in trunk (no patch). It is already broken, counting the
system-reserved space towards non-DFS used.
{code}
$ bin/hdfs dfsadmin -report
...

Name: 127.0.0.1:50010 (localhost)
Hostname: mint0
Decommission Status : Normal
Configured Capacity: 42141548544 (39.25 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2215555072 (2.06 GB)
{code}

I think {{File#getFreeSpace}} is the correct choice. However Brahma's patch does not introduce
a regression so it's fine to go ahead with {{File#getUsableSpace}} to address Chris's concern.
I'll file a separate Jira to discuss fixing this part.

> Reserved space is erroneously counted towards non-DFS used.
> -----------------------------------------------------------
>
>                 Key: HDFS-9038
>                 URL: https://issues.apache.org/jira/browse/HDFS-9038
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.1
>            Reporter: Chris Nauroth
>            Assignee: Brahma Reddy Battula
>         Attachments: HDFS-9038-002.patch, HDFS-9038-003.patch, HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038.patch
>
>
> HDFS-5215 changed the DataNode volume available space calculation to consider the reserved
> space held by the {{dfs.datanode.du.reserved}} configuration property.  As a side effect,
> reserved space is now counted towards non-DFS used.  I don't believe it was intentional to
> change the definition of non-DFS used.  This issue proposes restoring the prior behavior:
> do not count reserved space towards non-DFS used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
