hadoop-common-user mailing list archives

From Sean Laurent <organicveg...@gmail.com>
Subject Jobs run slower and slower
Date Tue, 03 Mar 2009 00:46:38 GMT
Hi all,
I'm conducting some initial tests with Hadoop to better understand how well
it will handle and scale with some of our specific problems. As a result,
I've written some M/R jobs that are representative of the work we want to
do. I then run the jobs multiple times in a row (sequentially) to get a
rough estimate of the average run time.

What I'm seeing is really strange... If I run the same job with the same
inputs multiple times, each successive run is slower than the previous run.
If I restart the cluster and re-run the tests, the first run is fast and
then each successive run is slower.

For example, I just started the cluster and ran the same job 4 times. The
run times for the jobs were as follows: 127 seconds, 177 seconds, 207
seconds and 218 seconds. I restarted HDFS and M/R, reran the job 3 more
times and got the following run times: 138 seconds, 187 seconds and 221
seconds. :(
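
To put numbers on the slowdown, the run times quoted above can be summarized with a few lines of Java. This is only an illustrative sketch: the class and method names (RunTimes, mean, slowdownPercent) are hypothetical and were not part of the original test setup; only the run times themselves come from the message.

```java
import java.util.Arrays;

// Hypothetical helper for summarizing the run times quoted in the message.
public class RunTimes {
    // Mean run time in seconds (NaN for an empty batch).
    static double mean(int[] seconds) {
        return Arrays.stream(seconds).average().orElse(Double.NaN);
    }

    // Percentage slowdown of each run relative to the first run in the batch.
    static double[] slowdownPercent(int[] seconds) {
        double[] out = new double[seconds.length];
        for (int i = 0; i < seconds.length; i++) {
            out[i] = 100.0 * (seconds[i] - seconds[0]) / seconds[0];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] firstBatch = {127, 177, 207, 218};   // after the first cluster start
        int[] secondBatch = {138, 187, 221};       // after restarting HDFS and M/R
        for (int[] batch : new int[][] {firstBatch, secondBatch}) {
            System.out.printf("mean=%.2fs slowdown%%=%s%n",
                    mean(batch), Arrays.toString(slowdownPercent(batch)));
        }
    }
}
```

For the first batch the mean is 182.25 seconds, and the fourth run takes roughly 72% longer than the first (91 extra seconds on a 127-second baseline). The second batch shows the same pattern after the restart, with the third run about 60% slower than its first.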

The map task is pretty simple - parse XML files to extract specific
elements. I'm using Cascading and wrote a custom Scheme, which in turn uses
a custom FileInputFormat that treats each file as an entire record
(splitable = false). Each file is then treated as a separate map task with
no reduce step.

In this case I have an 8-node cluster. 1 node acts as a dedicated
NameNode/JobTracker and 7 nodes run the DataNode/TaskTracker. Each machine
is identical: a Dell 1950 with an Intel quad-core 2.5GHz CPU, 8GB RAM and
two 250GB SATA2 drives. All 8 machines are in the same rack, connected
through a dedicated Force10 gigabit switch.

I tried enabling JVM reuse via JobConf, which improved performance for the
initial few runs... but each successive job still took longer than the
previous one. I also tried increasing the maximum memory via the
mapred.child.java.opts property, but that didn't have any impact.

I checked the logs, but I don't see any errors.

Here's my basic list of configured properties:


Frankly I'm stumped. I'm sure there is something obvious that I'm missing,
but I'm totally at a loss right now. Any suggestions would be ~greatly~
appreciated.


