Date: Tue, 19 Mar 2013 21:21:16 +0000 (UTC)
From: "Robert Joseph Evans (JIRA)"
To: common-dev@hadoop.apache.org
Subject: [jira] [Resolved] (HADOOP-9419) CodecPool should avoid OOMs with buggy codecs

     [ https://issues.apache.org/jira/browse/HADOOP-9419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans resolved HADOOP-9419.
-----------------------------------------
    Resolution: Won't Fix

Never mind. I created a patch, and it is completely useless in fixing this problem. The tasks still OOM because the codec object itself is so small, and the MergeManager creates new codecs so quickly, that on a job with lots of reduces it literally uses up all of the address space with direct byte buffers. Some of the processes are killed by the NM for exceeding the virtual address space limit before they even OOM.

We could try to have the CodecPool detect that a codec is doing the wrong thing and "correct" it on the codec's behalf, but in my opinion that is too heavy-handed.

> CodecPool should avoid OOMs with buggy codecs
> ---------------------------------------------
>
>                 Key: HADOOP-9419
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9419
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Robert Joseph Evans
>
> I recently found a bug in the gpl compression libraries that was causing map tasks for a particular job to OOM:
> https://github.com/omalley/hadoop-gpl-compression/issues/3
> Now, granted, it does not make much sense for a job to use the LzopCodec for map output compression instead of the LzoCodec, but arguably other codecs could be doing similar things and causing the same sort of memory leaks. I propose that we do a sanity check when creating a new decompressor/compressor: if the codec's newly created object does not match the value from getType... it should turn off caching for that Codec.
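
For illustration, a minimal sketch of the sanity check the issue proposes, assuming "getType..." refers to CompressionCodec.getDecompressorType(). The SanityCheckedPool class and its cachingDisabled set are hypothetical, not part of the real CodecPool API:

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;

// Hypothetical sketch (not the real CodecPool): stop pooling for any codec
// whose createDecompressor() returns an object that does not match
// getDecompressorType(), so a buggy codec cannot silently defeat the cache.
public class SanityCheckedPool {

  // Codec classes we refuse to cache for. Hypothetical state; the real
  // CodecPool keeps no such set.
  private static final Set<Class<?>> cachingDisabled =
      ConcurrentHashMap.newKeySet();

  public static Decompressor getDecompressor(CompressionCodec codec) {
    // A real pool would first look for a cached instance here.
    Decompressor decomp = codec.createDecompressor();
    // The proposed sanity check: the created object should be an
    // instance of the advertised decompressor type.
    if (decomp != null && !codec.getDecompressorType().isInstance(decomp)) {
      cachingDisabled.add(codec.getClass());
    }
    return decomp;
  }

  public static void returnDecompressor(CompressionCodec codec,
                                        Decompressor decomp) {
    if (cachingDisabled.contains(codec.getClass())) {
      decomp.end();  // release native/direct resources instead of caching
      return;
    }
    // ... otherwise return it to the per-type pool as CodecPool does ...
  }
}
{code}

As the resolution above explains, a check like this turned out not to help: the direct buffers are allocated when the codec objects are created, so refusing to cache them does nothing to stop a fast creator like the MergeManager from exhausting the address space first.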