hadoop-common-user mailing list archives

From Randy Fox <r...@connexity.com>
Subject NodeManager High CPU due to high GC
Date Sat, 23 Jan 2016 01:49:13 GMT
Hi,

We just upgraded to YARN on Hadoop 2.6.0 (CDH 5.4.5).
We are running a large job (200K mappers, 100K reducers) and we can’t get through the
shuffle phase. The NodeManagers are at 800% CPU with heavy GC activity. The reducers get socket
timeouts after 1.5 hours of running, having fetched only a few percent of the data from the
mappers. On MRv1 this job took about 30 hours in total (12 of them in the mappers) with no issues.

I have looked for configs that might help, for filed issues, and for anyone else who has seen
this, and I have come up with nothing.
Does anyone have ideas on things to try, or an explanation of why the NodeManagers are in GC
hell and why the data is just not flowing from mappers to reducers?

Thanks in advance,

Randy