hadoop-hdfs-issues mailing list archives

From "Plamen Jeliazkov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-4475) OutOfMemory by BPService.offerService() takes down DataNode
Date Wed, 06 Feb 2013 21:57:12 GMT

     [ https://issues.apache.org/jira/browse/HDFS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Plamen Jeliazkov updated HDFS-4475:
-----------------------------------

    Description: 
In DataNode, there are catch blocks around the BPService.offerService() call, but none for
OutOfMemoryError, unlike the DataXceiver, which has had one since 0.22.0.

The issue can be replicated like this:
1) Create a cluster of X DataNodes and 1 NameNode with low memory settings (-Xmx128M or something
similar).
2) Flood HDFS with small file creations (any files should work, actually).
3) DataNodes will hit an OutOfMemoryError, stop the block pool service, and shut down.

The resolution is to catch the OutOfMemoryError and handle it properly when calling BlockPool.offerService()
in DataNode.java, as is done in Hadoop 0.22.0. DataNodes should not shut down or crash,
but should remain in a sort of frozen state until the memory issues are resolved by GC.
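
The proposed handling can be sketched roughly as follows. This is a minimal, self-contained
illustration and not the actual DataNode code or the committed patch: the class
OomTolerantServiceLoop, the Service interface, the runUntilSuccess method, the bounded retry
count, and the backoff value are all hypothetical names and choices made for the example.
It only shows the general idea of catching OutOfMemoryError around the service call and
pausing instead of letting the thread die.

```java
// Hypothetical sketch: survive OutOfMemoryError around a service call by
// backing off and retrying, instead of letting the error end the thread.
class OomTolerantServiceLoop {

    /** Hypothetical stand-in for the block pool service call. */
    interface Service {
        void offerService() throws Exception;
    }

    /**
     * Calls the service until it returns normally or maxAttempts is reached.
     * Returns the number of OutOfMemoryErrors that were caught and survived.
     */
    static int runUntilSuccess(Service service, int maxAttempts, long backoffMillis) {
        int oomCount = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                service.offerService();
                break; // the call completed normally
            } catch (OutOfMemoryError oom) {
                // Stay alive: log, then pause so GC has a chance to free memory.
                oomCount++;
                System.err.println("Out of memory in offerService; retrying in "
                        + backoffMillis + " ms");
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            } catch (Exception e) {
                // Other failures are outside the scope of this sketch.
                System.err.println("offerService failed: " + e);
                break;
            }
        }
        return oomCount;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated service: the first two calls throw, the third succeeds.
        Service flaky = () -> {
            if (calls[0]++ < 2) {
                throw new OutOfMemoryError("simulated");
            }
        };
        int ooms = runUntilSuccess(flaky, 5, 10);
        // prints "OOMs survived: 2, calls made: 3"
        System.out.println("OOMs survived: " + ooms + ", calls made: " + calls[0]);
    }
}
```

A real service loop would presumably keep running for as long as the DataNode should run rather
than stop after a fixed number of attempts; the bound here only keeps the example finite.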

  was:
In DataNode, there are catch blocks around the BPService.offerService() call, but none for
OutOfMemoryError, unlike the DataXceiver, which has had one since 0.22.0.

The issue can be replicated like this:
1) Create a cluster of X DataNodes and 1 NameNode with low memory settings (-Xmx128M or something
similar).
2) Flood HDFS with of file creation.
3) DataNodes will hit an OutOfMemoryError, stop the block pool service, and shut down.

The resolution is to catch the OutOfMemoryError and handle it properly when calling BlockPool.offerService()
in DataNode.java, as is done in Hadoop 0.22.0. DataNodes should not shut down or crash,
but should remain in a sort of frozen state until the memory issues are resolved by GC.

    
> OutOfMemory by BPService.offerService() takes down DataNode
> -----------------------------------------------------------
>
>                 Key: HDFS-4475
>                 URL: https://issues.apache.org/jira/browse/HDFS-4475
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Plamen Jeliazkov
>            Assignee: Plamen Jeliazkov
>             Fix For: 3.0.0, 2.0.3-alpha
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
