hadoop-hdfs-user mailing list archives

From Randy Fox <r...@connexity.com>
Subject Re: NodeManager High CPU due to high GC
Date Sat, 23 Jan 2016 17:53:32 GMT
We have 24 virtual cores, and we allocated 22 of them to Yarn.

From: Daniel Haviv
Date: Saturday, January 23, 2016 at 4:00 AM
To: Randy Fox
Cc: "user@hadoop.apache.org"
Subject: Re: NodeManager High CPU due to high GC

Hi Randy,
How many cores do you have on your machines, and how many did you allocate for Yarn?


On Saturday, 23 January 2016, Randy Fox <rfox@connexity.com> wrote:

We just upgraded to Yarn on Hadoop 2.6.0 (CDH5.4.5).
We are running a large job (200K mappers, 100K reducers) and we can’t get through the
shuffle phase.  The node managers are at 800% CPU with high GC.  The reducers get socket
timeouts after 1.5 hours of running, having fetched only a few percent of the data from the
mappers.  On MRv1 this job took about 30 hours in total, 12 of them in the mappers, with no
issues.

I have looked for configs that might help, for filed issues, and for anyone else who has
seen this, and I have come up with nothing.
Does anyone have ideas on things to try, or can anyone explain why the node managers are in
GC hell and why the data is just not flowing from mappers to reducers?

Thanks in advance,
