[ https://issues.apache.org/jira/browse/HADOOP-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577853#action_12577853 ]

Allen Wittenauer commented on HADOOP-2991:
------------------------------------------

Available is *not* trustworthy. It doesn't always take quotas and reserved space into consideration. [Oh, how many times I've heard users complain with the "df says there is still room but I can't write to my home dir" statement...]

Oh, and inode counts. I completely forgot about that little gem.

Anyway, yes, I completely agree: this needs to be settable on a per-dir basis. That's actually something I've wanted for a while, since we store logs on the same dir as the data node. I want more space available on the node with the logs than the others.

I might even have a JIRA on this somewhere.... ahh yes, here it is: https://issues.apache.org/jira/browse/HADOOP-2150

> dfs.du.reserved not honored in 0.15/16 (regression from 0.14+patch for 2549)
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2991
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2991
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.0, 0.15.1, 0.15.2, 0.15.3, 0.16.0
>            Reporter: Joydeep Sen Sarma
>            Priority: Critical
>
> Changes for https://issues.apache.org/jira/browse/HADOOP-1463 have caused a regression. Earlier:
> - we could set dfs.du.reserved to 1G and be *sure* that 1G would not be used.
> Now this is no longer true. I am quoting Pete Wyckoff's example:
>
> Let's look at an example: a 100 GB disk, with /usr using 45 GB and dfs using 50 GB now.
> df -kh shows:
>   Capacity = 100 GB
>   Available = 1 GB (remember ~4 GB chopped out for metadata and stuff)
>   Used = 95 GB
> remaining = 100 GB - 50 GB - 1 GB = 49 GB
> Min(remaining, available) = 1 GB
> 98% of which is usable for DFS apparently -
> So, we're at the limit, but are free to use 98% of the remaining 1 GB.
>
> This is broken. Based on the discussion on 1463, it seems like the notion of 'capacity' as being the first field of 'df' is problematic. For example, here's what our df output looks like:
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda3             130G  123G   49M 100% /
> As you can see, 'Size' is a misnomer: that much space is not available. Rather, the actual usable space is 123G + 49M ~ 123G. (Not entirely sure what the discrepancy is due to, but I have heard this may be due to space reserved for file system metadata.) Because of this discrepancy, we end up in a situation where the file system is out of space.
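
For anyone who wants to replay the arithmetic in Pete Wyckoff's example, here is a minimal Python sketch. It is not the datanode code: the function names, the parameter names, and the 0.98 default are assumptions taken from the numbers quoted above, and the second function is only one hypothetical way of honoring the reserve, not a proposed patch.

```python
def usable_after_1463(capacity_gb, dfs_used_gb, reserved_gb,
                      df_available_gb, usable_fraction=0.98):
    """Space DFS considers writable, following the quoted example.

    usable_fraction=0.98 mirrors the "98% of which is usable" remark;
    the name is hypothetical, not a real configuration key.
    """
    # The reserve is subtracted from the raw capacity, not from what
    # the file system can actually still hand out.
    remaining = capacity_gb - dfs_used_gb - reserved_gb
    # The smaller of the two figures wins, so the reserve disappears
    # as soon as df's "Available" drops below "remaining".
    return min(remaining, df_available_gb) * usable_fraction


def usable_if_reserve_honored(df_available_gb, reserved_gb):
    """Hypothetical alternative: take the reserve out of what df reports."""
    return max(0, df_available_gb - reserved_gb)


# Numbers from the example: 100 GB disk, 50 GB used by dfs,
# 1 GB reserved, 1 GB reported as available by df.
current = usable_after_1463(100, 50, 1, 1)
honored = usable_if_reserve_honored(1, 1)
print(f"DFS thinks it may still write {current} GB")
print(f"With the reserve honored it could write {honored} GB")
```

With the example's numbers the first function still allows writes into the last gigabyte, while the second allows none, which is the behavior the reporter expected from the 1G reserve.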