hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-2409) Make EC2 image independent of Hadoop version
Date Tue, 10 Feb 2009 16:29:02 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Tom White updated HADOOP-2409:

    Attachment: hadoop-2409.patch

I've written a patch to make it easy to install Hadoop on startup, while still supporting
builds of AMIs that have it pre-installed.

* The default behavior is to download the specified version of Hadoop from S3 and install
it on startup. (Hadoop releases will be mirrored in a publicly readable S3 bucket.) The download
takes around 1 second, so the added startup overhead is negligible. Installing at startup also
makes it easy for people to use patched versions of Hadoop by uploading them to their own S3
bucket and changing a config setting.
* The scripts for creating an AMI retain the ability to install Hadoop, so you don't have to
install it on startup. I propose that the publicly available AMIs ship without Hadoop and
install it on startup, so they don't have to be regenerated for each Hadoop release. This
also makes it feasible to install other Hadoop software in the future, and avoid the combinatorial
explosion of version numbers.
* I've created AMIs that work with these changes (hadoop-images/hadoop-base-20090210-i386.manifest.xml
and hadoop-images/hadoop-base-20090210-x86_64.manifest.xml).
* I have also fixed HADOOP-5006 as a part of this change, which allowed me to see Ganglia
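
To make the install-on-startup behavior concrete, here is a rough Python sketch of that step.
It is not taken from the patch: the bucket name, the tarball URL layout, the install directory,
and the function names are all assumptions for illustration. It also assumes the tarball
unpacks into a hadoop-<version> directory.

```python
import tarfile
import urllib.request
from pathlib import Path

# Hypothetical default: the message does not name the public bucket.
DEFAULT_BUCKET = "example-hadoop-releases"


def hadoop_tarball_url(version, bucket=DEFAULT_BUCKET):
    """Build the S3 URL for a release tarball (the URL layout is assumed)."""
    return f"https://{bucket}.s3.amazonaws.com/hadoop-{version}/hadoop-{version}.tar.gz"


def install_hadoop(version, bucket=DEFAULT_BUCKET, dest="/usr/local"):
    """Download and unpack the requested release unless it is already present."""
    home = Path(dest) / f"hadoop-{version}"
    if home.exists():
        # An AMI built with Hadoop pre-installed skips the download entirely.
        return home
    # urlretrieve saves to a temporary file and returns its path.
    archive, _headers = urllib.request.urlretrieve(hadoop_tarball_url(version, bucket))
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return home
```

Pointing the bucket argument at a private bucket is how a patched build would be picked up
under these assumptions, for example install_hadoop("0.19.0", bucket="my-patched-hadoop").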

> Make EC2 image independent of Hadoop version
> --------------------------------------------
>                 Key: HADOOP-2409
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2409
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/ec2
>            Reporter: Tom White
>         Attachments: hadoop-2409.patch, HADOOP-2409.patch
> Instead of building a new image for each released version of Hadoop, install Hadoop on
> instance start up. Since it is a small download this would not add significantly to startup
> time. Hadoop releases would be mirrored on S3 for scalability (and to avoid bandwidth costs).
> The version to install would be found from the instance metadata - this would be a download
> More generally, the instance could retrieve a script to run on start up from a URL specified
> in the metadata. The script would install and configure Hadoop, but it could be extended to
> do cluster-specific set up.
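
As an illustration of the metadata-driven approach in the description, the following Python
sketch parses settings from EC2 user data and fetches a start-up script named there. The
KEY=VALUE format and the key names HADOOP_VERSION and STARTUP_SCRIPT_URL are hypothetical; the
issue does not specify them. The user-data address is the standard EC2 one, but newer instances
may require a session token before it can be read, which this sketch does not handle.

```python
import urllib.request

# EC2 serves launch-time user data at this link-local address.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def parse_user_data(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" (e.g. URLs).
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def read_user_data(url=USER_DATA_URL, timeout=5):
    """Fetch and parse the instance's user data (only works on an instance)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_user_data(response.read().decode("utf-8"))


def fetch_startup_script(settings, timeout=30):
    """Return the text of the start-up script named in the settings, or None."""
    url = settings.get("STARTUP_SCRIPT_URL")  # hypothetical key name
    if url is None:
        return None
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, read_user_data() would supply the settings, and the returned script text could
then be written to disk and executed to install and configure Hadoop.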

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.