hadoop-common-dev mailing list archives

From "Pete Wyckoff (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1463) dfs should report total size of all the space that dfs is using
Date Mon, 10 Mar 2008 19:38:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577139#action_12577139 ]

Pete Wyckoff commented on HADOOP-1463:
--------------------------------------

This patch seems to break the reserved-space and percent-used calculations.

The calculation below starts from total capacity, which overlooks the part of a partition
that is never usable (filesystem metadata and the like). It also doesn't take into account
any space used on the disk by anything other than DFS. Where is /usr accounted for here?

+      long remaining = getCapacity()-getDfsUsed()-reserved;
+      long available = usage.getAvailable();
+      if (remaining>available) {
+        remaining = available;
+      }
+      remaining = (long)(remaining * usableDiskPct); 
+      return (remaining > 0) ? remaining : 0;

This is a pretty big problem for us: about 90% of our / partitions have < 1 GB free (1 GB
is our reserve param), and 50% have no free space at all. NOTE: we do not use / for map/reduce.
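
To make the concern concrete, here is a minimal standalone sketch of the quoted lines. It is
only one reading of the snippet, not the actual datanode code: the class name, the method
parameters (which stand in for getCapacity(), getDfsUsed() and usage.getAvailable()), the
double type for usableDiskPct, and all the numbers are assumptions made for illustration.
It suggests that once remaining is capped at the free space, the reserve is no longer
subtracted, which may be part of the problem described above.

```java
public class RemainingSketch {
    // Restates the quoted patch lines as a pure function so they can be run in isolation.
    // All parameters are hypothetical stand-ins for the values the datanode would supply.
    static long remaining(long capacity, long dfsUsed, long reserved,
                          long available, double usableDiskPct) {
        long remaining = capacity - dfsUsed - reserved;
        if (remaining > available) {
            // Capped at what the filesystem reports as free; the reserve is
            // not subtracted from this value.
            remaining = available;
        }
        remaining = (long) (remaining * usableDiskPct);
        return (remaining > 0) ? remaining : 0;
    }

    public static void main(String[] args) {
        long GB = 1L << 30;
        // Hypothetical partition: 100 GB capacity, no DFS data, 1 GB reserve,
        // and only 0.5 GB free because other software fills the disk.
        long r = remaining(100 * GB, 0, 1 * GB, GB / 2, 1.0);
        System.out.println("reported remaining bytes: " + r);
        System.out.println("below the reserve: " + (r < 1 * GB));
    }
}
```

With these made-up numbers the function reports the full 0.5 GB as remaining even though
the free space is already below the 1 GB reserve.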


> dfs should report total size of all the space that dfs is using
> ---------------------------------------------------------------
>
>                 Key: HADOOP-1463
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1463
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.12.3
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.15.0
>
>         Attachments: usedSpace.patch, usedSpace.patch
>
>
> Currently the namenode reports two statistics back to the client:
> 1. The total capacity of dfs. This is the sum of all datanodes' capacities, each of which
> is calculated by the datanode summing the disk space of all its data directories.
> 2. The total remaining space of dfs. This is the sum of all datanodes' remaining space.
> Each datanode's remaining space is calculated using the following formula: remaining space
> = unused space - capacity*unusableDiskPercentage - reserved space. So the remaining space
> shows how much space dfs can still use, but it does not show the size of the unused space.
> Each dfs client calculates the total dfs used space by subtracting the remaining space from
> the total capacity. So the used space does not accurately show the space that dfs is using.
> However, it is a very important number that dfs should provide.
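
The arithmetic in the issue description can be sketched as follows. This is a hedged
illustration of the formula as written there, not code from Hadoop: the class and method
names, the reading of "unused space" as the free space on the disk, and the sample numbers
(including a 25% unusable percentage) are all assumptions.

```java
public class UsedSpaceSketch {
    // remaining space = unused space - capacity*unusableDiskPercentage - reserved space
    static long remaining(long unused, long capacity, double unusablePct, long reserved) {
        return unused - (long) (capacity * unusablePct) - reserved;
    }

    // What a client derives: used = total capacity - remaining
    static long clientUsed(long capacity, long remaining) {
        return capacity - remaining;
    }

    public static void main(String[] args) {
        long GB = 1L << 30;
        long capacity = 100 * GB;
        long actualDfsUsed = 10 * GB;             // hypothetical: DFS blocks really on disk
        long unused = capacity - actualDfsUsed;   // assumes nothing else is on the disk
        long rem = remaining(unused, capacity, 0.25, 1 * GB);
        long derived = clientUsed(capacity, rem);
        System.out.println("actual dfs used (GB):  " + actualDfsUsed / GB);
        System.out.println("client-derived used (GB): " + derived / GB);
    }
}
```

Under these assumptions the client-derived figure folds the unusable percentage and the
reserve into "used", so it overstates what dfs itself occupies (36 GB versus 10 GB here).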

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

