hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4382) Run Hadoop sort benchmark on Amazon EC2
Date Wed, 26 Nov 2008 17:40:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4382:
------------------------------

    Attachment: hadoop-4382.patch

A script that:

1. Launches a cluster on EC2
2. Waits for the cluster and Hadoop daemons to start
3. Runs a small sort job to warm up the cluster
4. Runs a sort job and emits the job duration
5. Terminates the cluster
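
The five steps above could be sketched roughly as follows. This is an illustrative outline only, not the contents of the attached patch: the callables launch, wait_ready, run_sort and terminate are hypothetical placeholders for whatever the EC2 scripts actually do, and the injectable clock exists only to make the sketch easy to test.

```python
import time


def run_benchmark(launch, wait_ready, run_sort, terminate, clock=time.monotonic):
    """Run the benchmark workflow and return the timed sort duration in seconds.

    All four callables are hypothetical hooks supplied by the caller; they are
    not functions from the real EC2 scripts.
    """
    cluster = launch()                       # 1. launch a cluster
    try:
        wait_ready(cluster)                  # 2. wait for nodes and daemons
        run_sort(cluster, warmup=True)       # 3. small warm-up sort
        start = clock()
        run_sort(cluster, warmup=False)      # 4. timed sort
        duration = clock() - start
        print(f"sort duration: {duration:.0f} s")
        return duration
    finally:
        terminate(cluster)                   # 5. terminate, even after a failure
```

Putting the termination in a finally block is a design choice of this sketch: a failed run should not leave paid EC2 instances running.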

Running on an 8-node cluster, it took 2742 seconds to sort 32 GB of data using the default hadoop-site.xml
that the EC2 scripts use. This could be improved with better-tuned settings.
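
For scale, the figures above can be turned into a throughput number. The short calculation below is only a sketch: it assumes 1 GB = 1024 MB and, for the per-node figure, that all 8 nodes took part in the sort, neither of which is stated in the message.

```python
def throughput_mb_per_s(data_gb, seconds, nodes=1):
    """Sort throughput in MB/s per node, taking 1 GB = 1024 MB (an assumption)."""
    return data_gb * 1024 / seconds / nodes


cluster_rate = throughput_mb_per_s(32, 2742)            # whole cluster
per_node_rate = throughput_mb_per_s(32, 2742, nodes=8)  # assumes 8 working nodes
print(f"cluster: {cluster_rate:.2f} MB/s, per node: {per_node_rate:.2f} MB/s")
```

That works out to roughly 12 MB/s for the cluster as a whole, or about 1.5 MB/s per node under the stated assumptions.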

There are several improvements that could be made to the script, in particular in detecting
when the cluster is ready to go (the current script waits until 90% of the nodes are up, then
waits 1 minute for Hadoop to start). There are more ideas here: http://www.nabble.com/Auto-shutdown-for-EC2-clusters-td20132561.html
It would also be good to do multiple runs, discard the first, and compute an average.
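
The readiness heuristic and the averaging idea might look something like the sketch below. It is not taken from the patch: count_up is a hypothetical callable returning the number of nodes currently up, and the polling interval and timeout are made-up defaults. Only the 90% threshold and the 1 minute wait come from the description above.

```python
import time
from statistics import mean


def wait_until_ready(count_up, total, threshold=0.9, settle_s=60,
                     poll_s=10, timeout_s=1800,
                     sleep=time.sleep, clock=time.monotonic):
    """Block until `threshold` of `total` nodes are up, then wait `settle_s` seconds.

    `count_up` is a hypothetical hook; `poll_s` and `timeout_s` are assumptions.
    """
    deadline = clock() + timeout_s
    while count_up() < threshold * total:
        if clock() >= deadline:
            raise TimeoutError("cluster did not reach the node threshold")
        sleep(poll_s)
    sleep(settle_s)  # fixed grace period for the Hadoop daemons to start


def average_excluding_first(durations):
    """Mean of all runs except the first, which is treated as a warm-up."""
    if len(durations) < 2:
        raise ValueError("need at least two runs to discard the first")
    return mean(durations[1:])
```

A fixed grace period is simple but blind; polling the daemons directly would be a more reliable readiness signal, which is the kind of improvement the paragraph above has in mind.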

This should be a good basis for running a regular EC2 benchmark from Hudson.

Comments welcome.

> Run Hadoop sort benchmark on Amazon EC2
> ---------------------------------------
>
>                 Key: HADOOP-4382
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4382
>             Project: Hadoop Core
>          Issue Type: Test
>          Components: contrib/ec2
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4382.patch
>
>
> By running a benchmark on EC2 we can see how well Hadoop performs, how to tune it, and
> how performance changes between releases.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

