hadoop-common-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6108) Add support for EBS storage on EC2
Date Wed, 25 Nov 2009 19:10:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782559#action_12782559 ]

Aaron Kimball commented on HADOOP-6108:

> README.txt
> 34) You seem to pick an arbitrary AMI. How is this chosen? Is there a list of appropriate
> AMIs somewhere?

I think listing the AMIs on a wiki page is probably best, since these may be updated from
time to time. AMIs simply need to run scripts passed as (compressed) user data and have Java
installed. It would be a good follow-up issue to include scripts to create these AMIs.

Agreed. Maybe create this wiki page and reference it in the README?

> 25) Is it really necessary to source ~/.bashrc? The /bin/bash on the shebang line should
> take care of this.

Only for interactive shells. I've run these scripts using Hudson, and this is then needed.

Hm. My understanding, from the INVOCATION section of the bash manpage, is that .bashrc is
read only by interactive, non-login shells, whereas .bash_profile is read by login shells.
If .bash_profile sets AWS_ACCESS_KEY_ID, etc., and .bashrc sets different values, then users
who have already configured their AWS credentials in .bash_profile will see their environment
shadowed. (I admit this case is rare, but the two files are distinct for a reason.) A more
plausible case is that users set default AWS credentials in .bashrc but sometimes override
them on the command line before invoking this script; sourcing ~/.bashrc inside the script
would clobber that override. So I'm not convinced this is a good general-purpose solution.
Hudson is likely using a non-interactive, non-login shell to invoke its scripts, in which case
it should either pass the AWS credentials explicitly in the environment it gives to subshells,
or name the config script via the {{$BASH_ENV}} variable, e.g.
{{BASH_ENV=/home/hudson/.bashrc /path/to/run-your-test-script-here}}.
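
As a hedged illustration of the startup-file behavior described above, here is a small sketch. The throwaway rc file and the DEMO_KEY variable are made up for the demo (they stand in for ~/.bashrc and the AWS credentials). It checks three things: a non-interactive bash reads no rc file on its own, setting BASH_ENV makes it read one, and a script that sources the rc file itself overwrites a value set on the command line.

```shell
#!/bin/bash
# Sketch only: demo_rc and DEMO_KEY are hypothetical stand-ins for
# ~/.bashrc and AWS_ACCESS_KEY_ID; nothing here touches real credentials.

tmpdir=$(mktemp -d)
rcfile="$tmpdir/demo_rc"
echo 'export DEMO_KEY=from-rcfile' > "$rcfile"

# 1. Non-interactive, non-login shell with BASH_ENV unset: no rc file is read.
plain=$(env -u BASH_ENV -u DEMO_KEY bash -c 'echo "${DEMO_KEY:-unset}"')

# 2. Same shell, but the caller names the rc file via BASH_ENV.
with_env=$(env -u DEMO_KEY BASH_ENV="$rcfile" bash -c 'echo "${DEMO_KEY:-unset}"')

# 3. A script that sources the rc file itself clobbers a command-line override.
script="$tmpdir/demo_script.sh"
cat > "$script" <<EOF
#!/bin/bash
. "$rcfile"
echo "\$DEMO_KEY"
EOF
clobbered=$(env -u BASH_ENV DEMO_KEY=from-command-line bash "$script")

echo "no BASH_ENV:           $plain"
echo "BASH_ENV set:          $with_env"
echo "script sources rc:     $clobbered"

rm -rf "$tmpdir"
```

With BASH_ENV the decision to load the rc file stays with the caller (Hudson, in this case), so a user who overrides a value on the command line and does not set BASH_ENV keeps that override.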

> 346) Since we're already parsing a config file, can the proxy port be configurable? What
> if the user's got two clusters running simultaneously?

I agree that this could be a limitation for some users. I'd like to tackle this in a follow-up
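
For reference, one way a configurable proxy port might look is sketched below. This is not taken from the patch: the key=value file layout, the {{proxy_port}} key, the file names, and the fallback value 6666 are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sketch: look up a per-cluster proxy port in a key=value config
# file and fall back to a default when the key is absent.

read_proxy_port() {
  local config_file=$1 default_port=$2 port
  # Take the last "proxy_port = <digits>" line, if any.
  port=$(sed -n 's/^[[:space:]]*proxy_port[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' \
           "$config_file" 2>/dev/null | tail -n 1)
  echo "${port:-$default_port}"
}

tmpdir=$(mktemp -d)
printf 'proxy_port = 7001\n' > "$tmpdir/cluster-a.cfg"        # explicit port
printf 'instance_type = m1.small\n' > "$tmpdir/cluster-b.cfg" # no port set

port_a=$(read_proxy_port "$tmpdir/cluster-a.cfg" 6666)
port_b=$(read_proxy_port "$tmpdir/cluster-b.cfg" 6666)

echo "cluster-a proxy port: $port_a"
echo "cluster-b proxy port: $port_b"

rm -rf "$tmpdir"
```

Two clusters running at once would then only need distinct port values in their respective config files.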


Everything else looks fine. Thanks for adding more comments. I'm +0.9 as-is; I would prefer
to see the bashrc hack addressed, though.

> Add support for EBS storage on EC2
> ----------------------------------
>                 Key: HADOOP-6108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6108
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: contrib/ec2
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: HADOOP-6108.patch, HADOOP-6108.patch, HADOOP-6108.patch
> By using EBS for namenode and datanode storage we can have persistent, restartable Hadoop
> clusters running on EC2.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
