hadoop-hdfs-user mailing list archives

From Rahul Patodi <patodirahul.had...@gmail.com>
Subject Re: problem using s3 instead of hdfs
Date Tue, 16 Oct 2012 10:23:30 GMT
I think these blog posts will answer your question:

http://www.technology-mania.com/2012/05/s3-instead-of-hdfs-with-hadoop_05.html
http://www.technology-mania.com/2011/05/s3-as-input-or-output-for-hadoop-mr.html


On Tue, Oct 16, 2012 at 1:30 PM, sudha sadhasivam <sudhasadhasivam@yahoo.com> wrote:

> Is there a time delay when fetching information from S3 to a Hadoop
> cluster, compared to a regular Hadoop cluster setup? Can Elastic Block
> Storage be used for this purpose?
> G Sudha
>
> --- On Tue, 10/16/12, Hemanth Yamijala <yhemanth@thoughtworks.com> wrote:
>
>
> From: Hemanth Yamijala <yhemanth@thoughtworks.com>
> Subject: Re: problem using s3 instead of hdfs
> To: user@hadoop.apache.org
> Date: Tuesday, October 16, 2012, 12:41 PM
>
>
> Hi,
>
> I've not tried this on S3. However, the directory mentioned in the
> exception is based on the value of this particular configuration
> key: mapreduce.jobtracker.staging.root.dir. This defaults
> to ${hadoop.tmp.dir}/mapred/staging. Can you please set this to an S3
> location and try?
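>
> (A minimal sketch of what that might look like in mapred-site.xml; the
> ID, SECRET, and BUCKET values and the staging path below are placeholders,
> not values from your setup:)
>
> <property>
>   <!-- Staging directory the JobTracker uses for job submission files -->
>   <name>mapreduce.jobtracker.staging.root.dir</name>
>   <value>s3n://ID:SECRET@BUCKET/tmp/mapred/staging</value>
> </property>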
>
> Thanks
> Hemanth
>
> On Mon, Oct 15, 2012 at 10:43 PM, Parth Savani <parth@sensenetworks.com> wrote:
>
> Hello,
>       I am trying to run Hadoop on S3 in distributed mode. However, I am
> having trouble running my job successfully on it; I get the error below.
> I followed the instructions provided in this article:
> http://wiki.apache.org/hadoop/AmazonS3
> I set the fs.default.name value in my hdfs-site.xml to
> s3n://ID:SECRET@BUCKET
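> (i.e., something along these lines; fs.default.name conventionally lives
> in core-site.xml, and ID, SECRET, and BUCKET stand in for the real
> credentials and bucket name:)
>
> <property>
>   <!-- Make the S3 native filesystem the default filesystem -->
>   <name>fs.default.name</name>
>   <value>s3n://ID:SECRET@BUCKET</value>
> </property>
>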
> And I am running my job with: hadoop jar
> /path/to/my/jar/abcd.jar /input /output
> where */input* is the folder name inside the S3 bucket
> (s3n://ID:SECRET@BUCKET/input)
> and */output* is the folder that should be created in my bucket
> (s3n://ID:SECRET@BUCKET/output).
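>
> (Equivalently, since unqualified paths resolve against the default
> filesystem, the same job could be launched with fully qualified URIs
> to make the target filesystem explicit:)
>
> hadoop jar /path/to/my/jar/abcd.jar s3n://ID:SECRET@BUCKET/input s3n://ID:SECRET@BUCKET/output
>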
> Below is the error I get. It is looking for job.jar on S3, but that path
> is on the server from which I am launching the job.
>
> java.io.FileNotFoundException: No such file or directory
> '/opt/data/hadoop/hadoop-mapred/mapred/staging/psavani/.staging/job_201207021606_1036/job.jar'
>   at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:412)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:207)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>   at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1371)
>   at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1352)
>   at org.apache.hadoop.mapred.JobLocalizer.localizeJobJarFile(JobLocalizer.java:273)
>   at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:381)
>   at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:371)
>   at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:222)
>   at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1372)
>   at java.security.AccessController.doPri


-- 
*Regards*,
Rahul Patodi
