hadoop-user mailing list archives

From SEBASTIAN ORTEGA TORRES <sort...@tid.es>
Subject Re: [Cosmos-dev] Out of memory in identity mapper?
Date Thu, 06 Sep 2012 16:22:20 GMT
There is no trace to check on the task; I get "n/a" instead of links to the traces in the
web UI.
Some of the maps are retried successfully while others fail again, until one of them fails
four times in a row and the job is automatically terminated. Is this compatible with protobuf
corruption? In that case I would expect subsequent attempts to fail consistently.
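(As an aside: four failed attempts in a row is the default retry limit after which the job
is killed. A minimal sketch of how that limit is configured, assuming the classic property
name:)

    import org.apache.hadoop.conf.Configuration;

    // A map task is re-attempted up to this many times before the whole
    // job is declared failed; 4 is the default value.
    Configuration conf = new Configuration();
    conf.setInt("mapred.map.max.attempts", 4);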

What is reproducible is the overall job failure, not individual map failures. Sorry for
the confusion.

--
Sebastián Ortega Torres
Product Development & Innovation / Telefónica Digital
C/ Don Ramón de la Cruz 82-84
Madrid 28006






On 06/09/2012, at 18:12, Harsh J wrote:

Protobuf involvement makes me more suspicious that this is possibly a
corruption or a serialization issue. Perhaps if you can share some
stack traces, people can help better. If it is reliably reproducible,
I'd also check the count of records read before the failure occurs,
and see whether the stack traces are always the same.

Serialization formats such as protobuf allocate objects based on sizes
read from the stream (for example, a string's length may be read first
and a buffer of that length pre-allocated before the string's bytes are
read into it). With corrupt data or bugs in the deserialization code,
it is quite easy for a badly read value to trigger a huge allocation
request. It's one possibility.
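As a minimal sketch of that failure mode (not protobuf's actual internals, just the
length-prefixed pattern):

    import java.io.DataInputStream;
    import java.io.IOException;

    // Length-prefixed read, the pattern described above: a corrupt or
    // misaligned length field becomes a huge allocation request before a
    // single payload byte is read.
    class LengthPrefixedReader {
        static byte[] readRecord(DataInputStream in) throws IOException {
            int length = in.readInt();     // corrupt data may yield e.g. 0x7FFF0000
            byte[] buf = new byte[length]; // pre-allocated from the read size
            in.readFully(buf);             // the OOM hits at the allocation above
            return buf;
        }
    }

A defensive variant would cap length against a known maximum record size before
allocating.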

Is the input compressed too, by the way? Can you seek out the input file the
specific map fails on, and try to read it in isolation to validate it?
Or do all maps seem to fail?
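Something like this standalone check could work, since your records are fixed-size
(Record.parseFrom stands in for your generated protobuf class; RECORD_SIZE is an
assumed record length):

    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.FileInputStream;

    // Standalone validator: parse each fixed-size record in isolation and
    // report the offset of the first failure. RECORD_SIZE and Record are
    // placeholders for the real record length and generated protobuf class.
    public class ValidateInput {
        static final int RECORD_SIZE = 128; // assumed

        public static void main(String[] args) throws Exception {
            byte[] buf = new byte[RECORD_SIZE];
            long offset = 0;
            try (DataInputStream in =
                     new DataInputStream(new FileInputStream(args[0]))) {
                while (true) {
                    in.readFully(buf);     // EOFException ends the loop
                    Record.parseFrom(buf); // throws on corrupt bytes
                    offset += RECORD_SIZE;
                }
            } catch (EOFException eof) {
                System.out.println("OK: " + offset + " bytes validated");
            } catch (Exception e) {
                System.out.println("Corrupt record at offset " + offset + ": " + e);
            }
        }
    }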

On Thu, Sep 6, 2012 at 9:01 PM, SEBASTIAN ORTEGA TORRES <sortega@tid.es> wrote:
Input files are small fixed-size protobuf records and yes, it is
reproducible (but it takes some time).
In this case I cannot use combiners, since I need to process all the elements
with the same key together.
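A sketch of what I mean, with illustrative types: only the reduce call receives
every value for a key, whereas a combiner would only ever see partial groups.

    import java.io.IOException;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // The reducer gets all values for a key in a single call, which is
    // what this job needs; combiners are applied to partial groups only.
    public class GroupReducer
            extends Reducer<Text, BytesWritable, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<BytesWritable> values,
                              Context ctx)
                throws IOException, InterruptedException {
            int count = 0;
            for (BytesWritable v : values) {
                count++; // placeholder for whole-group processing
            }
            ctx.write(key, new Text("group-size=" + count));
        }
    }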

Thanks for the prompt response

--
Sebastián Ortega Torres
Product Development & Innovation / Telefónica Digital
C/ Don Ramón de la Cruz 82-84
Madrid 28006






On 06/09/2012, at 17:13, Harsh J wrote:

I can imagine a huge record size possibly causing this. Is this
reliably reproducible? Do you also have combiners enabled, which may
run the reducer logic on the map side itself?
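(For reference, a combiner is set like this; a minimal new-API sketch, with
MyReducer as a placeholder class:)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // A combiner runs reduce-style logic on map output before the shuffle,
    // so it is only safe when that logic can be applied to partial groups.
    Job job = new Job(new Configuration(), "example");
    job.setCombinerClass(MyReducer.class); // MyReducer is a placeholder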

On Thu, Sep 6, 2012 at 8:20 PM, JOAQUIN GUANTER GONZALBEZ <ximo@tid.es> wrote:

Hello hadoopers!

In a reduce-only Hadoop job, input files are handled by the identity mapper
and sent to the reducers without modification. In one of my jobs I was
surprised to see the job failing in the map phase with "Out of memory error"
and "GC overhead limit exceeded".

In my understanding, a memory leak in the identity mapper is out of the
question.
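The setup is roughly the following (MyReducer stands in for the real reducer);
in the new API the Mapper base class is itself the identity mapper:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    // Reduce-only job: the Mapper base class passes each record through
    // unchanged, so the reducers receive the input unmodified.
    Job job = new Job(new Configuration(), "reduce-only");
    job.setMapperClass(Mapper.class);     // explicit, but also the default
    job.setReducerClass(MyReducer.class); // placeholder for the real reducer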


What can be the cause of such an error?

Thanks,
Ximo.

P.S. The logs show no stack trace other than the messages I mentioned
before.







--
Harsh J

_______________________________________________
Cosmos-dev mailing list
Cosmos-dev@tid.es
https://listas.tid.es/mailman/listinfo/cosmos-dev






--
Harsh J


