From: SEBASTIAN ORTEGA TORRES <sortega@tid.es>
Date: Thu, 06 Sep 2012 15:31:31 +0000
Subject: Re: [Cosmos-dev] Out of memory in identity mapper?
To: Harsh J
Cc: cosmos-dev@tid.es

Input files are small fixed-size protobuf records and yes, it is reproducible (but it takes some time).
In this case I cannot use combiners, since I need to process all the elements with the same key together.
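
A combiner only ever sees the map-local subset of values for a key, so it can only help when the reduce logic is applicable to partial groups (sums, counts and the like). Below is a minimal sketch of the opposite case, a reducer that has to buffer the complete group before it can emit anything; the class name and value types are illustrative rather than taken from the actual job:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Illustrative reducer: it must see every value of a key before it can emit
    // anything, so its logic cannot be run on map-local partial groups the way a
    // combiner runs reducer logic.
    public class WholeGroupReducer
        extends Reducer<Text, BytesWritable, Text, BytesWritable> {

        @Override
        protected void reduce(Text key, Iterable<BytesWritable> values, Context ctx)
                throws IOException, InterruptedException {
            List<byte[]> group = new ArrayList<byte[]>();
            for (BytesWritable value : values) {
                // Copy each value: Hadoop reuses the Writable instance between iterations.
                group.add(Arrays.copyOf(value.getBytes(), value.getLength()));
            }
            // ... process the complete group here (ordering, deduplication, etc.) ...
            for (byte[] record : group) {
                ctx.write(key, new BytesWritable(record));
            }
        }
    }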

Thanks for the prompt response

--
Sebastián Ortega Torres
Product Development & Innovation / Telefónica Digital
C/ Don Ramón de la Cruz 82-84
Madrid 28006

On 06/09/2012, at 17:13, Harsh J wrote:

I can imagine a huge record size possibly causing this. Is this
reliably reproducible? Do you also have combiners enabled, which may
run the reducer-logic on the map-side itself?
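
For reference, enabling a combiner usually amounts to one extra line in the job setup, as in the sketch below; MyMapper and MySumReducer are placeholders, and the heap setting is only an illustrative knob for the "GC overhead limit exceeded" symptom, not a diagnosis:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class CombinerExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Illustrative only: raising the task heap (Hadoop 1.x property name)
            // is a common first response to "GC overhead limit exceeded" in tasks.
            conf.set("mapred.child.java.opts", "-Xmx1024m");

            Job job = new Job(conf, "combiner example");
            job.setJarByClass(CombinerExample.class);
            job.setMapperClass(MyMapper.class);        // placeholder mapper
            job.setCombinerClass(MySumReducer.class);  // reducer-style logic run on map output before the shuffle
            job.setReducerClass(MySumReducer.class);   // placeholder reducer with associative, commutative logic
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }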

On Thu, Sep 6, 2012 at 8:20 PM, JOAQUIN GUANTER GONZALBEZ <ximo@tid.es> wrote:
Hello hadoopers!

In a reduce-only Hadoop job, input files are handled by the identity mapper and sent to the reducers without modification. In one of my jobs I was surprised to see it failing in the map phase with "Out of memory error" and "GC overhead limit exceeded".

In my understanding, a memory leak in the identity mapper is out of the question.

What can be the cause of such an error?

Thanks,

Ximo.

P.S. The logs show no stack trace other than the messages I mentioned before.
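
A reduce-only job of the kind described above can be set up roughly as in the following sketch. No mapper class is registered, so Hadoop's default Mapper acts as the identity mapper; the sequence-file input format and the reducer class are assumptions, not details from Ximo's job:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class ReduceOnlyJob {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "reduce-only job");
            job.setJarByClass(ReduceOnlyJob.class);

            // No setMapperClass(): the default org.apache.hadoop.mapreduce.Mapper
            // is an identity mapper; its map() re-emits each (key, value) pair.
            job.setReducerClass(WholeGroupReducer.class);  // placeholder reducer, e.g. the one sketched earlier
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(BytesWritable.class);

            // Assumption: the protobuf records arrive wrapped in sequence files.
            job.setInputFormatClass(SequenceFileInputFormat.class);
            job.setOutputFormatClass(SequenceFileOutputFormat.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }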


--
Harsh J

_______________________________________________
Cosmos-dev mailing list
Cosmos-dev@tid.es
https://listas.tid.es/mailman/listinfo/cosmos-dev

________________________________
This message is intended exclusively for its addressee. We only send and receive email on the basis of the terms set out at:
http://www.tid.es/ES/PAGINAS/disclaimer.aspx