hadoop-hdfs-issues mailing list archives

From "Laxman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3600) DataNode is not responding After throwing java.lang.OutOfMemoryError: Direct buffer memory
Date Wed, 04 Jul 2012 06:40:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406304#comment-13406304 ]

Laxman commented on HDFS-3600:
------------------------------

bq. -XX:+DisableExplicitGC
This could be the culprit: with this flag set, explicit System.gc() calls are ignored.

I have generally noticed this kind of problem in DataNode and RegionServer processes.
In these processes, native memory is used heavily via NIO, and I have seen a RegionServer (HBase)
process consume 20+ GB of memory although its max heap is configured to 4 GB (-Xmx).

So, in order to keep the memory footprint (VIRT and RES values) under control, we need to configure
MaxDirectMemorySize. At the same time, I observed that this direct memory is not part of the heap
and is reclaimed only by a full GC, which runs when direct memory reaches its limit or at the
RMI server DGC interval.
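
To watch direct memory separately from the heap, something like the following sketch may help.
It is only an illustration, not DataNode code: the class name DirectMemoryProbe is made up, and
it relies on BufferPoolMXBean, which needs Java 7 or later (the JDK 1.6.0_31 in this report does
not have it).

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper class, not part of Hadoop.
public class DirectMemoryProbe {

    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if the pool is not found. */
    public static long directBytesUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long heapMax = Runtime.getRuntime().maxMemory();
        long before = directBytesUsed();

        // Direct buffers live in native memory, outside the -Xmx heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024);

        long after = directBytesUsed();
        System.out.println("max heap bytes       : " + heapMax);
        System.out.println("direct bytes (before): " + before);
        System.out.println("direct bytes (after) : " + after);
        System.out.println("buffer capacity      : " + buf.capacity());
    }
}
```

The native memory behind such a buffer is released only after the owning ByteBuffer object has
been garbage collected, which is why direct memory usage depends on GC activity.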

To conclude: configure MaxDirectMemorySize, but DON'T use DisableExplicitGC.
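
As a rough way to check a running JVM against this recommendation, a sketch like the one below
could be used. The class and method names (GcFlagCheck, followsRecommendation) are hypothetical,
and the check simply encodes the suggestion above; it is not an official Hadoop or JVM rule.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of Hadoop.
public class GcFlagCheck {

    /**
     * Returns true when the arguments set -XX:MaxDirectMemorySize
     * and do not enable -XX:+DisableExplicitGC.
     */
    public static boolean followsRecommendation(List<String> jvmArgs) {
        boolean directLimitSet = false;
        boolean explicitGcDisabled = false;
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:MaxDirectMemorySize=")) {
                directLimitSet = true;
            } else if (arg.equals("-XX:+DisableExplicitGC")) {
                explicitGcDisabled = true;
            } else if (arg.equals("-XX:-DisableExplicitGC")) {
                // A later "-" form switches the flag back off.
                explicitGcDisabled = false;
            }
        }
        return directLimitSet && !explicitGcDisabled;
    }

    public static void main(String[] args) {
        // Inspect the flags the current JVM was started with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("follows recommendation: " + followsRecommendation(jvmArgs));
    }
}
```

Run inside the DataNode JVM (or fed with its option list), this would report false for the
GC options quoted in the issue, because they include -XX:+DisableExplicitGC.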

@Brahma, can you please post your findings after removing this flag (DisableExplicitGC)?
                
> DataNode is not responding After throwing java.lang.OutOfMemoryError: Direct buffer memory
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3600
>                 URL: https://issues.apache.org/jira/browse/HDFS-3600
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
>
> Scenario:
> =========
> Started NN with four DNs.
> Wrote a client program that continuously writes, appends, and reads data with 10 threads.
> After 4 hours, got an OOME. The DN was then listed as dead: it is not sending any heartbeats, but
> GC is happening.
>  *GC OPTS configured for DN* 
> -Xms3G -Xmx4G -XX:NewSize=256M -XX:MaxNewSize=512M -XX:PermSize=128M -XX:MaxPermSize=128M
> -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=1G -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=65 -Xloggc:/home/install/hadoop/datanode/logs/datanode-root-gc.log
> -XX:+PrintGCDetails -XX:+DisableExplicitGC
>  
>  *CPU usage for DN* 
> {noformat}
> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:     15955M total,    15806M used,      148M free,      436M buffers
> Swap:    12284M total,        9M used,    12274M free,    11422M cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  7431 root      20   0 6291m 2.6g  13m S    0 16.4  55:20.65 java
> {noformat}
>  *JAVA Version* 
> {noformat}
> sun.boot.library.path = /root/nodesetup/java/jdk1.6.0_31/jre/lib/amd64
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
