hadoop-mapreduce-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: issue with map running time
Date Fri, 06 Jul 2012 17:00:08 GMT
How long a program takes to run depends on a lot of things.  It could be a connectivity issue,
or it could be that your program does a lot more processing for some input records than for
others, or it could be that some of your records are a lot smaller, so that more of them fit
in a single input split.  Without knowing what the code is doing, it is hard to say more than
that.
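
One rough way to tell these cases apart is to compare, per map task, the run time against the number of input records it read (for example, taken by hand from each task's "Map input records" counter). The sketch below is only an illustration under that assumption: the function name, the dictionary keys, the 3x threshold, and the sample numbers are all made up, not anything from Hadoop itself or from the job in question.

```python
from statistics import median

def diagnose_map_tasks(tasks, slow_factor=3.0):
    """Label map tasks that ran much longer than the median task.

    tasks: list of dicts with hypothetical keys 'id', 'seconds', 'records'.
    slow_factor: assumed threshold; a value is "much larger" when it
    exceeds slow_factor times the median across all tasks.
    """
    def rate(t):
        # Seconds spent per input record; guard against empty splits.
        return t["seconds"] / max(t["records"], 1)

    med_seconds = median(t["seconds"] for t in tasks)
    med_records = median(t["records"] for t in tasks)
    med_rate = median(rate(t) for t in tasks)

    report = {}
    for t in tasks:
        if t["seconds"] <= slow_factor * med_seconds:
            continue  # not an outlier, nothing to explain
        if t["records"] > slow_factor * med_records:
            # The split simply holds far more (probably smaller) records.
            report[t["id"]] = "record_skew"
        elif rate(t) > slow_factor * med_rate:
            # Similar record count but far slower per record: expensive
            # records, or a slow or contended node or network link.
            report[t["id"]] = "slow_per_record"
        else:
            report[t["id"]] = "unclear"
    return report

# Made-up sample: three normal tasks and two 12-hour stragglers.
sample_tasks = [
    {"id": "m0", "seconds": 300, "records": 1000},
    {"id": "m1", "seconds": 400, "records": 1200},
    {"id": "m2", "seconds": 350, "records": 1100},
    {"id": "m3", "seconds": 43200, "records": 150000},
    {"id": "m4", "seconds": 43200, "records": 1000},
]
sample_report = diagnose_map_tasks(sample_tasks)
```

A "slow_per_record" label does not separate heavy records from a connectivity or contention problem by itself; checking whether the slow tasks keep landing on the same nodes would be the next thing to look at.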

--Bobby Evans

From: Kasi Subrahmanyam <kasisubbu440@gmail.com>
Reply-To: mapreduce-user@hadoop.apache.org
To: mapreduce-user@hadoop.apache.org
Subject: issue with map running time

Hi,

I have a job with, let us say, 10 mappers running in parallel.
Some run fast, but a few of them take too long.
For example, a few mappers take 5 to 10 minutes, but others take around 12 hours or more.
Can a difference in the data handled by the mappers cause such a variation, or is it
a connectivity issue?

Note: The cluster we are using has multiple users running their jobs on it.

Thanks in advance.
Subbu
