hadoop-common-dev mailing list archives

From "Pete Wyckoff (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2991) dfs.du.reserved not honored in 0.15/16 (regression from 0.14+patch for 2549)
Date Tue, 11 Mar 2008 01:10:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577266#action_12577266 ]

Pete Wyckoff commented on HADOOP-2991:
--------------------------------------


The formula should be:

min(((DF.capacity - Conf.reserved) * Conf.dfs.du.pct) - DU.dfsSpace(), DF.available() - Conf.reserved)

Now, as Joy says, DF.capacity is actually not all usable - some of it is reserved for
filesystem metadata.

So capacity should be DF.available() + DF.used()
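A quick sketch of that computation (variable and parameter names here are illustrative, not the actual Hadoop DataNode API):

```python
def dfs_available(df_available, df_used, du_dfs_space, reserved, du_pct):
    """Space (same units throughout) that DFS may still use on a volume.

    capacity is taken as df_available + df_used, since df's raw
    'capacity' figure includes filesystem metadata that is never usable.
    """
    capacity = df_available + df_used
    return min((capacity - reserved) * du_pct - du_dfs_space,
               df_available - reserved)

# With the numbers from the example quoted below (in GB):
# df reports 1 available and 95 used; DFS holds 50; reserve 1; pct 0.98
print(dfs_available(1, 95, 50, 1, 0.98))  # 0 - the reserve is honored
```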

Also, Hairong, I understand the new meaning for reserved, but it really is too hard to use.
We'd have to figure out, on every machine and every drive, what this amount of space is.  The
older semantics are much easier to use and help a lot. Now that I'm a user :) I can see that
being able to tell Hadoop (dfs and mapred) never to use the last 1 GB or last .5 GB or whatever,
for safety reasons, is helpful.

Yes, this means that the amount of space available to DFS will fluctuate, but so what. When
there's not enough space due to other things on the drive, the drive isn't used; when there's
space, it is.

-- pete

> dfs.du.reserved not honored in 0.15/16 (regression from 0.14+patch for 2549)
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2991
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2991
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.0, 0.15.1, 0.15.2, 0.15.3, 0.16.0
>            Reporter: Joydeep Sen Sarma
>            Priority: Critical
>
> changes for https://issues.apache.org/jira/browse/HADOOP-1463
> have caused a regression. earlier:
> - we could set dfs.du.reserve to 1G and be *sure* that 1G would not be used.
> now this is no longer true. I am quoting Pete Wyckoff's example:
> <example>
> Let's look at an example: a 100 GB disk, with /usr using 45 GB and dfs using 50 GB.
> df -kh shows:
> Capacity = 100 GB
> Available = 1 GB (remember ~4 GB chopped out for metadata and stuff)
> Used = 95 GB
> remaining = 100 GB - 50 GB - 1 GB = 49 GB
> Min(remaining, available) = 1 GB
> 98% of which is usable for DFS apparently -
> So, we're at the limit, but are free to use 98% of the remaining 1 GB.
> </example>
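The arithmetic in the quoted example, as a quick sketch (values in GB; names are illustrative):

```python
# Numbers from the example above: shows why the 0.15/16 computation still
# hands space to DFS even though only the 1 GB reserve is actually left.
capacity = 100   # df's raw capacity figure
dfs_used = 50
reserved = 1     # dfs.du.reserved
available = 1    # what df actually reports (~4 GB lost to fs metadata)

remaining = capacity - dfs_used - reserved   # 100 - 50 - 1 = 49
usable = min(remaining, available)           # min(49, 1) = 1
print(usable * 0.98)  # ~0.98 GB still offered to DFS despite the reserve
```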
> this is broken. based on the discussion on 1463 - it seems like the notion of 'capacity'
> as being the first field of 'df' is problematic. For example - here's what our df output
> looks like:
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda3             130G  123G   49M 100% /
> as you can see - 'Size' is a misnomer - that much space is not available. Rather, the
> actual usable space is 123G+49M ~ 123G. (not entirely sure what the discrepancy is due to -
> but have heard this may be due to space reserved for filesystem metadata). Because of this
> discrepancy - we end up in a situation where the file system is out of space.
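A quick sanity check on the quoted df line (values converted to GB; this just restates the arithmetic, not any Hadoop code):

```python
# From 'df -kh': Size 130G, Used 123G, Avail 49M.
# The usable total is Used + Avail, not the reported Size.
size_gb = 130.0
used_gb = 123.0
avail_gb = 49 / 1024.0          # 49M expressed in GB

usable_gb = used_gb + avail_gb  # ~123 GB
gap_gb = size_gb - usable_gb    # ~7 GB swallowed by fs metadata/reservation
print(round(gap_gb, 2))
```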

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

