hadoop-common-user mailing list archives

From Sean Laurent <organicveg...@gmail.com>
Subject Re: Jobs run slower and slower
Date Wed, 04 Mar 2009 16:44:07 GMT
On Tue, Mar 3, 2009 at 10:14 PM, Amar Kamat <amarrk@yahoo-inc.com> wrote:

> Yeah. May be its not the problem with the JobTracker. Can you check (via
> job history) what is the best and the worst task runtimes? You can analyze
> the jobs after they complete.
> Amar

Okay, I ran the same job 35 times last night. The runs were identical: each
one parsed the same 1000 files, already stored in HDFS, using a map task
(no reduce). As in all of my previous tests, each successive run took
longer than the one before it.

Looking at the job history, the first run was the fastest; it took a total
of 2min 28sec (setup: 2sec, map: 2min 22sec, cleanup: 0sec). The last run
was the slowest; it took a total of 22min 31sec (setup: 16sec, map: 22min
14sec, cleanup: 16sec).
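
For reference, the slowdown implied by these two totals can be worked out
with a short script. This is only a sketch: `parse_duration` is a
hypothetical helper written for this illustration, not part of Hadoop, and
the per-run figure is a plain average across the 35 runs that assumes
nothing about how the runs in between behaved.

```python
import re


def parse_duration(text):
    """Convert a string such as '2min 28sec' or '16sec' to whole seconds.

    Hypothetical helper for this illustration; it only understands
    minute and second fields written the way they appear in this message.
    """
    seconds = 0
    for value, unit in re.findall(r"(\d+)\s*(min|sec)", text):
        seconds += int(value) * (60 if unit == "min" else 1)
    return seconds


RUNS = 35  # number of identical jobs reported in the message

first_total = parse_duration("2min 28sec")   # fastest (first) run
last_total = parse_duration("22min 31sec")   # slowest (last) run

# Ratio of the last run's wall time to the first run's wall time.
slowdown = last_total / first_total

# Average growth per successive run; there are RUNS - 1 steps between
# the first and the last run.
avg_increase = (last_total - first_total) / (RUNS - 1)

print(f"first run: {first_total}s, last run: {last_total}s")
print(f"slowdown: {slowdown:.1f}x")
print(f"average increase per run: {avg_increase:.1f}s")
```

With the totals quoted above this gives a slowdown of roughly 9.1x, or about
35 seconds added per run on average.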

Memory usage on the JT/NN machine, as reported by sar, slowly increased over
the 7-hour window. Memory usage on a randomly selected DN/TT also increased
steadily over the same window, but far more rapidly. We also looked at I/O
usage and CPU utilization on both the JT/NN machine and the same randomly
selected DN/TT and saw nothing out of the ordinary. I/O waits (both from the
I/O subsystem's perspective and from the CPU's perspective) were
consistently low over the 7-hour window and did not fluctuate significantly
on any of the machines. CPU utilization was practically non-existent on the
JT/NN and hovered between 40% and 60% on the DN/TT.

