hadoop-user mailing list archives

From "David Parks" <davidpark...@yahoo.com>
Subject RE: [Bulk] Re: Failed To Start SecondaryNameNode in Secure Mode
Date Tue, 04 Dec 2012 08:00:07 GMT
I'm curious about profiling. I'm running 1.0.3 on AWS and I see some
documentation about it, but the references to JobConf seem to be for the
"old API", and I've got everything running on the "new API".

I've got a job that processes about 30GB of compressed CSVs, and it's
taking over a day on 3 m1.medium boxes, which is more than I expected, so
I'd like to see where the time is being spent.

http://hadoop.apache.org/docs/r1.0.3/mapred_tutorial.html#Profiling

I've never set up any kind of profiling, so I don't really know what to
expect here.

Any pointers to help me set up what's suggested here? Am I correct in
understanding that this doc is a little outdated?

