hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2409) Make EC2 image independent of Hadoop version
Date Fri, 18 Apr 2008 10:48:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590380#action_12590380 ]

Steve Loughran commented on HADOOP-2409:

The idea would be that when the user issues a "yum update" (assuming an RPM-based distro),
the hadoop RPM would be updated (along with any other patches). Better yet, you run "yum
update hadoop-on-s3" and get an update of that package only, as a full update may have adverse
side effects (see http://www.1060.org/blogxter/entry?publicid=0C5798DE7C14EE57D8BEA1E1E945872E).
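
As a rough illustration of the difference between the two commands, the sketch below builds
either a full update or one scoped to a single package. The package name hadoop-on-s3 is the
hypothetical one from this comment, and the helper names are invented for illustration.

```python
import subprocess


def yum_update_command(package=None):
    """Return the argv for a yum update, scoped to one package if given."""
    cmd = ["yum", "-y", "update"]
    if package:
        # Only this package (and its dependencies) is updated.
        cmd.append(package)
    return cmd


def run_update(package=None):
    # Needs root on an RPM-based system with yum installed.
    return subprocess.run(yum_update_command(package), check=True)
```

Calling run_update("hadoop-on-s3") would leave unrelated packages alone, which is the point of
the scoped form.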

The nice thing about this approach is that it integrates with the Linux ecosystem; people can
uninstall you cleanly, and the OS can stop your files getting stamped on.

Have I done this? Well, we release RPMs, which get built on Linux using <rpmbuild>;
these then get handed off to other people to put in their repositories. So I don't know how
to set up a Yum-compatible file system. I do know how to build RPMs under Ant, taking .spec
files and setting them up. It's painful, but once you get the hang of things, not too hard.
You do just need a clean RPM-based VM around to test your installation on, which is where
EC2, VMware or Xen come into the picture. We test locally on VMware, but now I can start/stop
EC2 images during tests, so those could be targeted directly.

The big issue is the engineering effort to create the RPMs, to write the tests and maintain
the .spec files. Surely there must be people out there who create their own Hadoop RPMs? Ideally
we'd take existing work (such as a hadoop-core RPM) and add a new hadoop-on-ec2 RPM that depended
on the base RPMs.
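
To make the layering concrete, here is a minimal sketch that renders a skeleton .spec file for
a hadoop-on-ec2 package that requires hadoop-core. Both package names come from this comment;
the version, summary and helper name are placeholders, and the output has not been run through
rpmbuild, so treat it as an outline rather than a working spec.

```python
def render_spec(name, version, summary, requires=()):
    """Render a skeleton RPM .spec file as a string."""
    lines = [
        f"Name: {name}",
        f"Version: {version}",
        "Release: 1",
        f"Summary: {summary}",
        "License: Apache-2.0",
        "BuildArch: noarch",
    ]
    # One Requires: tag per dependency; this is what ties the
    # add-on package to the base package.
    lines += [f"Requires: {dep}" for dep in requires]
    # An empty %files section makes this a pure dependency package.
    lines += ["", "%description", summary, "", "%files"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_spec(
        "hadoop-on-ec2",
        "0.0.0",  # placeholder version
        "EC2 support files for Hadoop (illustrative)",
        requires=["hadoop-core"],
    ))
```

With a Requires: line in place, installing or updating hadoop-on-ec2 pulls in hadoop-core
automatically, so the EC2-specific package can stay small.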

> Make EC2 image independent of Hadoop version
> --------------------------------------------
>                 Key: HADOOP-2409
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2409
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/ec2
>            Reporter: Tom White
>         Attachments: HADOOP-2409.patch
> Instead of building a new image for each released version of Hadoop, install Hadoop on
> instance start up. Since it is a small download this would not add significantly to startup
> time. Hadoop releases would be mirrored on S3 for scalability (and to avoid bandwidth costs).
> The version to install would be found from the instance metadata - this would be a download
> More generally, the instance could retrieve a script to run on start up from a URL specified
> in the metadata. The script would install and configure Hadoop, but it could be extended to
> do cluster-specific set up.
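
The start-up flow described in the issue could be sketched roughly as follows. This is not the
attached patch. It assumes the instance user data holds KEY=VALUE lines; the keys
HADOOP_VERSION, HADOOP_MIRROR and BOOTSTRAP_SCRIPT_URL, the mirror directory layout and the
/tmp and /usr/local paths are all hypothetical. It also assumes the user data endpoint can be
read without a session token.

```python
import subprocess
import urllib.request

# EC2 serves the user data supplied at launch from this link-local address.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def parse_user_data(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def release_url(settings):
    """Build the tarball URL on the mirror (the layout is an assumption)."""
    base = settings["HADOOP_MIRROR"].rstrip("/")
    version = settings["HADOOP_VERSION"]
    return f"{base}/hadoop-{version}/hadoop-{version}.tar.gz"


def fetch(url, timeout=10):
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def bootstrap():
    """Run once at instance start-up, as root."""
    settings = parse_user_data(fetch(USER_DATA_URL).decode("utf-8"))
    script_url = settings.get("BOOTSTRAP_SCRIPT_URL")
    if script_url:
        # General case: the metadata names a script to download and run.
        with open("/tmp/bootstrap.sh", "wb") as f:
            f.write(fetch(script_url))
        subprocess.run(["sh", "/tmp/bootstrap.sh"], check=True)
    else:
        # Simple case: download and unpack the requested Hadoop release.
        with open("/tmp/hadoop.tar.gz", "wb") as f:
            f.write(fetch(release_url(settings)))
        subprocess.run(
            ["tar", "-xzf", "/tmp/hadoop.tar.gz", "-C", "/usr/local"],
            check=True,
        )
```

Keeping the parsing and URL construction in pure functions means they can be tested off EC2;
only bootstrap() touches the network and the file system.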

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
