Subject: Getting Hadoop Working on EC2/S3
From: Stephen Watt <swatt@us.ibm.com>
To: core-user@hadoop.apache.org
Date: Mon, 29 Sep 2008 14:19:51 -0500

Hi Folks

Before I get started, I just want to state that I've done the due diligence: I've read Tom White's articles as well as the EC2 and S3 pages on the Hadoop Wiki, and done some searching on this. Thus far I have successfully got Hadoop running on EC2 with no problems. In my local Hadoop 0.18 environment I simply add my AWS keys to hadoop-ec2-env.sh, kick off the src/contrib/ec2/bin/hadoop-ec2 launch-cluster script, and it works great. Now I'm trying to use the public Hadoop EC2 images to run over S3 instead of HDFS. Those images are set up to take all of their config options from variables passed in at a parameterized launch, everything EXCEPT fs.default.name (the default filesystem).
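For reference, getting the stock HDFS-backed cluster up is just a matter of something like this (a sketch from memory of the 0.18 contrib/ec2 scripts; the keys, keypair name and cluster name below are placeholders, not my real values):

# edits to src/contrib/ec2/bin/hadoop-ec2-env.sh
AWS_ACCOUNT_ID=123456789012
AWS_ACCESS_KEY_ID=MY_ACCESS_KEY
AWS_SECRET_ACCESS_KEY=MY_SECRET_KEY
KEY_NAME=gsg-keypair
PRIVATE_KEY_PATH=~/.ec2/id_rsa-gsg-keypair

# then bring up the cluster (name and size are up to you)
src/contrib/ec2/bin/hadoop-ec2 launch-cluster my-hadoop-cluster 20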
So in order to bring up a cluster of 20 Hadoop instances that run over S3, I need to modify the config file to point fs.default.name at my S3 bucket and keep the rest the same, which means I need my own image. I am attempting this with the local src/contrib/ec2/bin/hadoop-ec2 create-image script. I've tried this both on a Windows system (Cygwin environment) AND on my Ubuntu 8 system, and with each one it gets all the way to the end and then fails as it attempts to save the new image to my bucket, saying the bucket does not exist with a Server.NoSuchBucket (404) error.

The S3 bucket definitely does exist. It contains block data written by my Hadoop jobs. I can launch a single Hadoop image on EC2, manually set it up to use S3, run bin/hadoop dfs -ls /, and see the contents of my S3 bucket. I can also successfully use that S3 bucket as the input and output of my jobs on a single EC2 Hadoop instance. I've tried creating new buckets using the Firefox S3 Organizer plugin and pointing the scripts at those instead, and it's still the same error.

Any ideas? Is anyone having similar problems?

Regards
Steve Watt
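P.S. In case it helps anyone reproduce the working single-instance setup, here is roughly what I do by hand on that one image (a sketch; the bucket name and keys are placeholders, HADOOP_HOME stands for wherever Hadoop lives on the image, and rewriting hadoop-site.xml wholesale is just to keep the example short):

# point the default filesystem at the S3 bucket (Hadoop 0.18 S3 block filesystem)
cat > $HADOOP_HOME/conf/hadoop-site.xml <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://my-hadoop-bucket</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>MY_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>MY_SECRET_KEY</value>
  </property>
</configuration>
EOF

# with the default filesystem pointing at the bucket, this lists its contents
bin/hadoop dfs -ls /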