hadoop-common-user mailing list archives

From slitz <slitzferr...@gmail.com>
Subject Using S3 Block FileSystem as HDFS replacement
Date Tue, 01 Jul 2008 03:04:46 GMT
I've been trying to set up hadoop to use S3 as its filesystem. I read in the wiki
that it's possible to choose either the S3 Native FileSystem or the S3 Block
FileSystem. I would like to use the S3 Block FileSystem to avoid having to
"manually" transfer data from S3 to HDFS every time I want to run a job.
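For reference, the S3 block filesystem is selected by pointing fs.default.name at an s3:// URI (s3n:// would select the native filesystem instead) in hadoop-site.xml. A minimal sketch of the relevant properties — the bucket name and key values below are placeholders, not real configuration:

```xml
<configuration>
  <!-- s3:// scheme selects the S3 Block FileSystem -->
  <property>
    <name>fs.default.name</name>
    <value>s3://YOUR-BUCKET</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

The credentials can also be embedded in the URI itself (s3://ID:SECRET@BUCKET), though keeping them in separate properties avoids escaping problems with the secret key.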

I'm still experimenting with the EC2 contrib scripts, and so far those seem to be
working. What I can't understand is how it would be possible to use S3 with a
public hadoop AMI, since from my understanding hadoop-site.xml gets written on each
instance startup with the options from hadoop-init, and it seems that the
public AMI (at least the 0.17.0 one) is not configured to use S3 at
all (which makes sense, because the bucket would need individual configuration).

So... to use the S3 Block FileSystem with EC2, I need to create a custom AMI with
a modified hadoop-init script, right? Or am I completely confused?
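To make the question concrete, a modified hadoop-init could write the S3 properties into hadoop-site.xml at instance startup. This is only a sketch of that idea — the file path, bucket name, and key values are hypothetical placeholders, not taken from the actual contrib scripts:

```shell
#!/bin/sh
# Sketch: generate a hadoop-site.xml pointing at the S3 block filesystem.
# In a real AMI this file would live under the Hadoop conf directory.
CONF=hadoop-site.xml

cat > "$CONF" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://YOUR-BUCKET</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
EOF

# Simple sanity check that the S3 scheme made it into the config.
grep -q 's3://' "$CONF" && echo "S3 filesystem configured"
```

Since the credentials differ per user, they would have to be passed into the instance somehow (e.g. via EC2 user-data) rather than baked into a public AMI, which is presumably why the stock image leaves them out.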

