hadoop-common-user mailing list archives

From: Tom White <...@cloudera.com>
Subject: Re: How do I reference S3 from an EC2 Hadoop cluster?
Date: Wed, 25 Nov 2009 05:20:16 GMT

If the data was transferred to S3 outside of Hadoop, then you should
use the s3n filesystem scheme (see the explanation on
http://wiki.apache.org/hadoop/AmazonS3 for the differences between the
Hadoop S3 filesystems).
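
In short, per that wiki page:

  s3://<BUCKET>/...   - block filesystem; stores data in a format that
                        only Hadoop itself can read back
  s3n://<BUCKET>/...  - native filesystem; reads and writes ordinary S3
                        objects, such as logs uploaded by other tools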

Also, some people have had problems embedding the secret key in the
URI, so you can set the credentials in the configuration instead.
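
A minimal sketch of the relevant properties, assuming conf/core-site.xml
(hadoop-site.xml on releases before 0.20) and placeholder credential values:

  <!-- AWS credentials for the s3n filesystem (placeholder values) -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_AWS_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
  </property>
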
Then use a URI of the form s3n://<BUCKET>/path/to/logs
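
With the credentials in the configuration, your distcp command becomes
something like:

  bin/hadoop distcp s3n://<BUCKET>/path/to/logs logs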


On Tue, Nov 24, 2009 at 5:47 PM, Mark Kerzner <markkerzner@gmail.com> wrote:
> Hi,
> I need to copy data from S3 to HDFS. This command:
> bin/hadoop distcp s3://<ID>:<SECRET>@<BUCKET>/path/to/logs logs
> does not seem to work.
> Thank you.
