hadoop-hdfs-issues mailing list archives

From "JichengSong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.
Date Thu, 08 Jan 2015 10:30:35 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

JichengSong updated HDFS-7592:
------------------------------
    Description: 
In our HDFS production environment, the NameNode runs full GC (FGC) frequently after running
for 2 months, and we have to restart it manually.
We dumped the NameNode heap to collect per-class object statistics.
Before restarting NameNode:
    num   #instances      #bytes  class name
    ----------------------------------------------
      1:    59262275  3613989480  [Ljava.lang.Object;
            ...
     10:     8549361   615553992  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     11:     5941511   427788792  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
After restarting NameNode:
    num   #instances      #bytes  class name
    ----------------------------------------------
      1:    44188391  2934099616  [Ljava.lang.Object;
            ...
     23:      721763    51966936  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     24:      620028    44642016  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
We found that the number of BlockInfoUnderConstruction instances was abnormally large before
the restart. BlockInfoUnderConstruction holds block state only while a file is being written,
and the write load on our cluster is far below a million per second, so that many blocks under
construction is implausible. We think there is a memory leak in the NameNode.
We fixed the bug with the following patch.
diff --git a/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java b/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java
index 7a40522..857d340 100644
--- a/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java
+++ b/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java
@@ -205,6 +205,8 @@ class BlocksMap {
       DatanodeDescriptor dn = currentBlock.getDatanode(idx);
       dn.replaceBlock(currentBlock, newBlock);
     }
+    // change to fix bug about memory leak of NameNode
+    map.remove(newBlock);
     // replace block in the map itself
     map.put(newBlock, newBlock);
     return newBlock;
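
A minimal sketch of why removing the key before the put may matter. It assumes that `map` in
BlocksMap behaves like a java.util.HashMap keyed by the block object, and that block equality
depends only on the block ID; both are assumptions, not confirmed by this report. `Blk` and
`KeyRetentionDemo` are hypothetical names for illustration and are not Hadoop classes. When
an equal key is already present, HashMap.put replaces only the value and keeps the existing
key object, so the old under-construction object would stay reachable as the key.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyRetentionDemo {
    /** Hypothetical stand-in for a block: equality depends on the id only. */
    static final class Blk {
        final long id;
        final String label;

        Blk(long id, String label) {
            this.id = id;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Blk && ((Blk) o).id == id;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    /** Returns the label of the key object the map holds after the replacement. */
    public static String keyLabelAfterReplace(boolean removeFirst) {
        Map<Blk, Blk> map = new HashMap<>();
        Blk oldBlock = new Blk(1L, "old");
        Blk newBlock = new Blk(1L, "new"); // equal to oldBlock, but a different object
        map.put(oldBlock, oldBlock);

        if (removeFirst) {
            map.remove(newBlock); // drops the entry, including the old key object
        }
        map.put(newBlock, newBlock); // without the remove, only the value is replaced

        return map.keySet().iterator().next().label;
    }

    public static void main(String[] args) {
        System.out.println("put only:        key = " + keyLabelAfterReplace(false)); // old
        System.out.println("remove then put: key = " + keyLabelAfterReplace(true));  // new
    }
}
```

Under these assumptions, the map still references the old object after a plain put, which
would match the growing BlockInfoUnderConstruction count in the heap dump above.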

  was:
In our HDFS production environment, the NameNode runs full GC (FGC) frequently after running
for 2 months, and we have to restart it manually.
We dumped the NameNode heap to collect per-class object statistics.
Before restarting NameNode:
    num   #instances      #bytes  class name
    ----------------------------------------------
      1:    59262275  3613989480  [Ljava.lang.Object;
            ...
     10:     8549361   615553992  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     11:     5941511   427788792  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
After restarting NameNode:
    num   #instances      #bytes  class name
    ----------------------------------------------
      1:    44188391  2934099616  [Ljava.lang.Object;
            ...
     23:      721763    51966936  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     24:      620028    44642016  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
We found that the number of BlockInfoUnderConstruction instances was abnormally large before
the restart. BlockInfoUnderConstruction holds block state only while a file is being written,
and the write load on our cluster is far below a million per second, so that many blocks under
construction is implausible. We think there is a memory leak in the NameNode.
We fixed the bug with the following patch.

--- src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java      (revision 1640066)
+++ src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java
@@ -205,6 +205,8 @@
       DatanodeDescriptor dn = currentBlock.getDatanode(idx);
       dn.replaceBlock(currentBlock, newBlock);
     }
+    // change to fix bug about memory leak of NameNode
+    map.remove(newBlock);
     // replace block in the map itself
     map.put(newBlock, newBlock);


> A bug in BlocksMap that  cause NameNode  memory leak.
> -----------------------------------------------------
>
>                 Key: HDFS-7592
>                 URL: https://issues.apache.org/jira/browse/HDFS-7592
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.21.0
>         Environment: HDFS-0.21.0
>            Reporter: JichengSong
>            Assignee: JichengSong
>              Labels: BlocksMap, leak, memory
>             Fix For: 0.21.0
>
>         Attachments: HDFS-7592.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
