hadoop-hdfs-dev mailing list archives

From Brahma Reddy Battula <brahmareddy.batt...@huawei.com>
Subject RE: [HDFS-9038] Non-Dfs used Calculation
Date Wed, 20 Apr 2016 10:57:55 GMT
>>> It is incorrect to minus reserved from usage.getAvailable() above since the reserved
space, which is the space reserved for non-hdfs used, may already be occupied by some non-hdfs
files but not necessarily empty space.
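You are right. It's incorrect to subtract all of the reserved space. But we still need to subtract
the part of reserved that non-dfs files have not yet used, i.e. reserved minus the actual non-dfs
usage, when that usage is less than reserved. If the non-dfs usage is more than reserved,
then nothing needs to be subtracted.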

Maybe the actual confusion started with this description of 'dfs.datanode.du.reserved' in HDFS-5215:
"Reserved space in bytes per volume. Always leave this much space free for non dfs use."
Reserved space is for the non-dfs files. HDFS should not use reserved space to store dfs files.

But, if the reserved space is already used by non-dfs files, then HDFS need not care about
reserved anymore in getAvailable().

Considering that the non-dfs value shown in metrics is the unplanned non-dfs usage, i.e. extra usage
beyond reserved, I hope the changes below are fine. If they are okay, I will update the patch in
HDFS-9038 based on this.

---------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
index 0d060f9..451b258 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
@@ -383,7 +383,7 @@ public void setCapacityForTesting(long capacity) {
   @Override
   public long getAvailable() throws IOException {
     long remaining = getCapacity() - getDfsUsed() - reservedForReplicas.get();
-    long available = usage.getAvailable() - reserved
+    long available = usage.getAvailable() - getRemainingReserved()
         - reservedForReplicas.get();
     if (remaining > available) {
       remaining = available;
@@ -391,6 +391,31 @@ public long getAvailable() throws IOException {
     return (remaining > 0) ? remaining : 0;
   }

+  private long getActualNonDfsUsed() throws IOException {
+    return usage.getUsed() - getDfsUsed();
+  }
+
+  private long getRemainingReserved() throws IOException {
+    long actualNonDfsUsed = getActualNonDfsUsed();
+    if (actualNonDfsUsed < reserved) {
+      return reserved - actualNonDfsUsed;
+    }
+    return 0L;
+  }
+
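+  /**
+   * Unplanned non-DFS usage, i.e. extra usage beyond reserved.
+   * @return non-DFS bytes used in excess of reserved, or 0 if there are none
+   * @throws IOException if the DFS usage of the volume cannot be read
+   */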
+  public long getNonDfsUsed() throws IOException {
+    long actualNonDfsUsed = getActualNonDfsUsed();
+    if (actualNonDfsUsed < reserved) {
+      return 0L;
+    }
+    return actualNonDfsUsed - reserved;
+  }
+
   @VisibleForTesting
   public long getReservedForReplicas() {
     return reservedForReplicas.get();
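
To make the two cases concrete, here is a minimal standalone sketch of the same arithmetic. Plain long parameters stand in for usage.getUsed(), getDfsUsed() and reserved. The class name ReservedSpaceSketch and the sample numbers are only illustrative assumptions; they are not part of the patch.

```java
public class ReservedSpaceSketch {

  // Raw non-DFS usage: everything on the volume that is not DFS data.
  static long actualNonDfsUsed(long used, long dfsUsed) {
    return used - dfsUsed;
  }

  // Part of the reserved space that non-DFS files have not consumed yet.
  static long remainingReserved(long used, long dfsUsed, long reserved) {
    long actual = actualNonDfsUsed(used, dfsUsed);
    return (actual < reserved) ? reserved - actual : 0L;
  }

  // Unplanned non-DFS usage: only the part that goes beyond reserved.
  static long nonDfsUsed(long used, long dfsUsed, long reserved) {
    long actual = actualNonDfsUsed(used, dfsUsed);
    return (actual < reserved) ? 0L : actual - reserved;
  }

  public static void main(String[] args) {
    long reserved = 10L;
    long dfsUsed = 50L;

    // Case 1: non-DFS files occupy 4 units, which is less than reserved.
    System.out.println(remainingReserved(54L, dfsUsed, reserved)); // 6
    System.out.println(nonDfsUsed(54L, dfsUsed, reserved));        // 0

    // Case 2: non-DFS files occupy 15 units, which is more than reserved.
    System.out.println(remainingReserved(65L, dfsUsed, reserved)); // 0
    System.out.println(nonDfsUsed(65L, dfsUsed, reserved));        // 5
  }
}
```

In case 1 only the 6 units still held back for non-DFS use are subtracted in getAvailable(); in case 2 nothing is subtracted and the metric reports the 5 units of overshoot.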


--Brahma Reddy Battula

-----Original Message-----
From: Ravi Prakash [mailto:ravihadoop@gmail.com] 
Sent: 16 April 2016 09:40
To: hdfs-dev; Tsz Wo Sze
Subject: Re: [HDFS-9038] Non-Dfs used Calculation

I meant HDFS-9530 instead of HDFS-9038. We are seeing an issue where none of the datanodes
are available for writes, and my suspicion is that incorrect calculation of all these numbers
is causing it.

On Fri, Apr 15, 2016 at 6:38 PM, Ravi Prakash <ravihadoop@gmail.com> wrote:

> Hi Nicholas!
>
> Could you please point out exactly which place you are seeing this 
> {{available = usage.getAvailable() - reserved}} calculation? I'm sorry 
> I'm a bit confused because there are several places you could be 
> talking about (in the patch / in the unpatched NN code / in the
> unpatched DN code).
>
> It seems to me the non-DFS used is only ever used to display a number 
> on a UI , so I would prefer to resolve this sooner so that we can nail 
> down more important issues e.g. HDFS-9038.
>
> Thanks
> Ravi
>
> On Wed, Apr 13, 2016 at 12:06 AM, Tsz Wo Sze 
> <szetszwo@yahoo.com.invalid>
> wrote:
>
>> available = usage.getAvailable() - reserved
>>
>> It is incorrect to minus reserved from usage.getAvailable() above 
>> since the reserved space, which is the space reserved for non-hdfs 
>> used, may already be occupied by some non-hdfs files but not necessarily empty space.
>> In the pre HDFS-5215 calculation, the non-DFS used is like "unplanned 
>> non-DFS used" while the "planned non-DFS used" is the reserved space.
>> Tsz-Wo
>>
>>
>>
>>     On Wednesday, April 13, 2016 2:35 PM, Brahma Reddy Battula < 
>> brahmareddy.battula@huawei.com> wrote:
>>
>>
>>
>>  Gentle reminder!!
>>
>>
>> --Brahma Reddy Battula
>>
>> From: Brahma Reddy Battula
>> Sent: 28 March 2016 12:26
>> To: hdfs-dev@hadoop.apache.org
>> Cc: 'aagarwal@hortonworks.com'; 'cnauroth@hortonworks.com'; '
>> vinayakumarb@apache.org'
>> Subject: [HDFS-9038] Non-Dfs used Calculation
>>
>> Hi All,
>>
>> Chris Nauroth, Arpit, Vinay and I have been discussing this calculation.
>>
>> There is a disagreement on the definition of non-DFS used space, 
>> because of which the issue is not making progress.
>> Essentially, it's a question of whether this metric means "Raw 
>> Non-DFS Used" or "Unplanned Non-DFS Used".
>>
>>
>> Here is the summary of the conversation, by Arpit.
>>
>> The pre HDFS-5215 calculation had two bugs.
>>
>>  1. It incorrectly subtracted reserved space from the non-DFS used. 
>> (net negative). Chris suggests this is not really an issue as non-DFS 
>> used should be shown as zero unless it exceeds the DFS reserved value.
>>
>>   2. It used File#getUsableSpace to calculate the volume free space 
>> instead of File#getFreeSpace. (net positive)
>>
>> The net effect was that non-DFS used was displayed as zero unless the 
>> actual non-DFS used exceeded DFS reserved - system reserved.
>>
>> HDFS-5215 fixed the first issue and the value that is now erroneously 
>> counted towards non-DFS used is in fact the system reserved 5%.
>>
>> From the testing it was found that, "Ext derivatives hold back 5% 
>> free space while XFS does not."
>>
>>
>> Proposed calculation to report the exact Non-DFS Usage:
>>
>>   non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
>>               = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
>>               = usage.getCapacity() - getDfsUsed() - totalFreeSpace
>>               = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
>>
>> Chris Nauroth thinks we should subtract "dfs.datanode.du.reserved" 
>> for non-dfs used because that allows monitoring for unexpected 
>> non-zero non-DFS usage and reacting to it.
>>
>> Akira has also given "+0" on the above calculation.
>>
>> We would like to get your input so that we can make some progress on the issue.
>>
>> Please let me know your thoughts on this issue.
>>
>> Thanks
>> --Brahma Reddy Battula
>>
>>
>>
>>
>>
>
>
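
For reference, the "raw non-DFS used" formula proposed in the original mail quoted above (File#getTotalSpace - getDfsUsed() - File#getFreeSpace) can be sketched with java.io.File alone. This is only a rough illustration: the class name RawNonDfsUsedSketch is made up, and dfsUsed is a placeholder of 0 because a real DataNode takes it from its own DFS usage accounting.

```java
import java.io.File;

public class RawNonDfsUsedSketch {

  // Raw non-DFS used as proposed in the thread: total - dfsUsed - free.
  static long rawNonDfsUsed(long totalSpace, long dfsUsed, long freeSpace) {
    return totalSpace - dfsUsed - freeSpace;
  }

  public static void main(String[] args) {
    // Volume directory to inspect; defaults to the current directory.
    File volume = new File(args.length > 0 ? args[0] : ".");

    long total = volume.getTotalSpace();   // size of the partition
    long free = volume.getFreeSpace();     // all unallocated bytes
    long usable = volume.getUsableSpace(); // bytes available to this JVM

    // Placeholder assumption: no DFS data on this volume.
    long dfsUsed = 0L;

    // On filesystems that hold back blocks for the superuser, this
    // difference is typically non-zero.
    System.out.println("free - usable    = " + (free - usable));
    System.out.println("raw non-DFS used = " + rawNonDfsUsed(total, dfsUsed, free));
  }
}
```

Running it against an ext volume and an XFS volume should show whether the "free - usable" gap corresponds to the 5% system reserved space mentioned in the summary.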