hadoop-common-dev mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HADOOP-9419) CodecPool should avoid OOMs with buggy codecs
Date Tue, 19 Mar 2013 21:21:16 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans resolved HADOOP-9419.
-----------------------------------------

    Resolution: Won't Fix

Never mind.  I created a patch, and it is completely useless in fixing this problem.  The
tasks still OOM because the codec itself is so small, and the MergeManager creates new codecs
so quickly, that on a job with lots of reduces it literally uses up all of the address space
with direct byte buffers.  Some of the processes get killed by the NM for going over the virtual
address space limit before they OOM.  We could try to have the CodecPool detect that the codec is
doing the wrong thing and "correct" it for the codec, but that is too heavy-handed in my opinion.
                
> CodecPool should avoid OOMs with buggy codecs
> ---------------------------------------------
>
>                 Key: HADOOP-9419
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9419
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Robert Joseph Evans
>
> I recently found a bug in the GPL compression libraries that was causing map tasks for
> a particular job to OOM.
> https://github.com/omalley/hadoop-gpl-compression/issues/3
> Now, granted, it does not make a lot of sense for a job to use the LzopCodec for map output
> compression over the LzoCodec, but arguably other codecs could be doing similar things and
> causing the same sort of memory leaks.  I propose that we do a sanity check when creating
> a new decompressor/compressor.  If the codec's newly created object does not match the value
> from getType... it should turn off caching for that Codec.
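
The sanity check proposed in the description could look roughly like the sketch below. It is
not Hadoop's CodecPool and not the patch mentioned in the resolution; CheckedPool, borrow,
giveBack and idleCount are hypothetical names, and plain Java objects stand in for compressors.
It assumes a pool that looks up idle objects by the type a codec advertises but files returned
objects under their concrete class, so a mismatch between the two means returned objects are
never reused. Note that the resolution comment above reports a patch for this issue did not
stop the OOMs in practice.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for a codec pool; illustrative only. */
class CheckedPool {
    // Idle instances, keyed by the concrete class of the returned object.
    private final Map<Class<?>, Deque<Object>> idle = new HashMap<>();
    // Concrete classes that some factory produced under a different advertised type.
    private final Set<Class<?>> mismatched = new HashSet<>();

    /** Borrow an instance of the advertised type, creating one if none is idle. */
    synchronized Object borrow(Class<?> advertisedType, Supplier<Object> factory) {
        Deque<Object> queue = idle.get(advertisedType);
        if (queue != null && !queue.isEmpty()) {
            return queue.pop();
        }
        Object created = factory.get();
        // Sanity check: the new object's class must equal the advertised type.
        // If it does not, remember the class so it is never cached on return.
        if (created.getClass() != advertisedType) {
            mismatched.add(created.getClass());
        }
        return created;
    }

    /** Return an instance; reports whether it was actually kept for reuse. */
    synchronized boolean giveBack(Object instance) {
        Class<?> actualType = instance.getClass();
        if (mismatched.contains(actualType)) {
            // Caching is off for this class: pooling it would only accumulate
            // objects that no borrow() call could ever find again.
            return false;
        }
        idle.computeIfAbsent(actualType, k -> new ArrayDeque<>()).push(instance);
        return true;
    }

    /** Number of idle instances currently held for the given class. */
    synchronized int idleCount(Class<?> type) {
        Deque<Object> queue = idle.get(type);
        return queue == null ? 0 : queue.size();
    }
}
```

With this shape, a well-behaved factory gets its objects reused, while a mismatched one has
its objects dropped on return instead of piling up in the pool. A real pool would presumably
also have to release whatever native or direct-buffer resources a dropped object holds, which
this sketch leaves out.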

