hadoop-common-dev mailing list archives

From "Khaled Elmeleegy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5632) Jobtracker leaves tasktrackers underutilized
Date Sun, 07 Jun 2009 19:25:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717086#action_12717086 ]

Khaled Elmeleegy commented on HADOOP-5632:

To test the scalability of the patch, i.e., whether the patched JT can keep up with the load
in a large cluster, I used a ~200-node cluster. Each node has two quad-core CPUs, 16GB of
memory, and 4 disks. On each node, I ran 10 TaskTrackers, simulating a ~2000-node cluster.
Each TT has 6 map slots and 2 reduce slots. I patched the Hadoop trunk with the 5632 patch.
I ran the sleep job from the examples jar with 500,000 maps and a map runtime (sleep time)
of 20 seconds. I measured the slot utilization and it was 99.8%. I monitored the CPU
utilization at the jobtracker, and all 8 CPUs were about 20% busy on average. The jobtracker's
CPU utilization varies over time, but the CPUs are nowhere near saturation. Also, I didn't
observe lock contention, i.e., load was more or less evenly balanced among all the CPUs.
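
For a sense of scale, the setup above can be turned into a rough capacity estimate. The following is only a back-of-the-envelope sketch: it treats "~200" as exactly 200 nodes, assumes every map takes exactly the sleep time, and ignores scheduling and startup overhead. The function and variable names are made up for illustration and are not part of Hadoop.

```python
# Rough capacity estimate for the simulated cluster (illustrative only).
NODES = 200               # assumption: "~200" treated as exactly 200
TTS_PER_NODE = 10         # TaskTrackers run on each physical node
MAP_SLOTS_PER_TT = 6
REDUCE_SLOTS_PER_TT = 2
NUM_MAPS = 500_000
MAP_RUNTIME_S = 20        # sleep time per map, in seconds


def cluster_slots(nodes, tts_per_node, slots_per_tt):
    """Total slots of one kind across all simulated TaskTrackers."""
    return nodes * tts_per_node * slots_per_tt


def ideal_map_phase_seconds(num_maps, map_slots, map_runtime_s):
    """Lower bound on map-phase time if every slot were busy all the time."""
    return num_maps * map_runtime_s / map_slots


map_slots = cluster_slots(NODES, TTS_PER_NODE, MAP_SLOTS_PER_TT)
reduce_slots = cluster_slots(NODES, TTS_PER_NODE, REDUCE_SLOTS_PER_TT)
ideal_s = ideal_map_phase_seconds(NUM_MAPS, map_slots, MAP_RUNTIME_S)

print(f"simulated TaskTrackers: {NODES * TTS_PER_NODE}")
print(f"map slots: {map_slots}, reduce slots: {reduce_slots}")
print(f"ideal map phase: {ideal_s:.1f} s")
```

Under these assumptions there are 12,000 map slots, so the 500,000 maps amount to roughly 42 full waves of 20-second tasks.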

To stress-test the JT, I increased the load by reducing the map runtime (sleep time) to 1
second. The JT was still able to keep up with the load without problems.
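
To illustrate why a shorter sleep time is a heavier test for the JT, the sketch below estimates how many map tasks would finish (and need replacing) per second if every map slot stayed busy. This is an upper bound derived from the slot counts in this comment, with "~200" nodes taken as exactly 200; it is not a measured figure, and the helper name is hypothetical.

```python
# Upper bound on map-task turnover the JT would see (illustrative only).
MAP_SLOTS = 200 * 10 * 6  # assumed nodes * TTs per node * map slots per TT


def steady_state_task_rate(map_slots, map_runtime_s):
    """Tasks completing per second if all slots are always occupied."""
    return map_slots / map_runtime_s


rate_20s = steady_state_task_rate(MAP_SLOTS, 20)  # first experiment
rate_1s = steady_state_task_rate(MAP_SLOTS, 1)    # stress test

print(f"20 s maps: up to {rate_20s:.0f} task completions/s")
print(f"1 s maps:  up to {rate_1s:.0f} task completions/s")
print(f"load increase: {rate_1s / rate_20s:.0f}x")
```

Cutting the map runtime from 20 seconds to 1 second raises this bound by a factor of 20, which is why the shorter sleep time serves as a stress test.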

> Jobtracker leaves tasktrackers underutilized
> --------------------------------------------
>                 Key: HADOOP-5632
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5632
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
>         Environment: 2x HT 2.8GHz Intel Xeon, 3GB RAM, 4x 250GB HD linux boxes, 100 node
>            Reporter: Khaled Elmeleegy
>         Attachments: hadoop-khaled-tasktracker.10s.uncompress.timeline.pdf, hadoop-khaled-tasktracker.150ms.uncompress.timeline.pdf, jobtracker.patch, jobtracker20.patch
> For some workloads, the jobtracker doesn't keep all the slots utilized even under heavy

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
