hadoop-common-user mailing list archives

From "Jimmy Lin" <jimmy...@umd.edu>
Subject slash in AWS Secret Key, WAS Re: Namenode Exceptions with S3
Date Wed, 09 Jul 2008 19:44:51 GMT
I've come across this problem before.  My simple solution was to
regenerate new keys until I got one without a slash... ;)

-Jimmy

> I have Hadoop 0.17.1 and an AWS Secret Key that contains a slash ('/').
>
> With distcp, I found that using the URL format s3://ID:SECRET@BUCKET/
> did not work, even if I encoded the slash as "%2F".  I got
> "org.jets3t.service.S3ServiceException: S3 HEAD request failed.
> ResponseCode=403, ResponseMessage=Forbidden"
>
> When I put the AWS Secret Key in hadoop-site.xml and wrote the URL as
> s3://BUCKET/ it worked.
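>
> For reference, the working setup was just these two properties in
> hadoop-site.xml (values elided):
>
>   <property>
>     <name>fs.s3.awsAccessKeyId</name>
>     <value>YOUR_ACCESS_KEY_ID</value>
>   </property>
>
>   <property>
>     <name>fs.s3.awsSecretAccessKey</name>
>     <value>YOUR_SECRET_KEY</value>
>   </property>
>
> with the slash left as-is in the secret key, and a plain s3://BUCKET/
> URL on the distcp command line.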
>
> I have periods ('.') in my bucket name; that was not a problem.
>
> What's weird is that org.apache.hadoop.fs.s3.Jets3tFileSystemStore
> uses java.net.URI, which should take care of decoding the %2F.
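>
> For what it's worth, a tiny standalone check (the key and bucket below
> are made up) confirms that java.net.URI does decode %2F in the userinfo
> part, so the problem may lie in how the decoded value is used afterwards:
>
>   import java.net.URI;
>
>   public class S3UriCheck {
>       public static void main(String[] args) throws Exception {
>           URI uri = new URI("s3://ACCESSKEY:SECRET%2FPART@mybucket/");
>           // getUserInfo() returns the decoded form
>           System.out.println(uri.getUserInfo());    // ACCESSKEY:SECRET/PART
>           // getRawUserInfo() keeps the percent-encoding
>           System.out.println(uri.getRawUserInfo()); // ACCESSKEY:SECRET%2FPART
>       }
>   }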
>
> -Stuart
>
>
> On Wed, Jul 9, 2008 at 1:41 PM, Lincoln Ritter
> <lincoln@lincolnritter.com> wrote:
>> So far, I've had no luck.
>>
>> Can anyone out there clarify the permissible characters/format for aws
>> keys and bucket names?
>>
>> I haven't looked at the code here, but it seems strange to me that the
>> same restrictions on host/port etc apply given that it's a totally
>> different system.  I'd love to see exceptions thrown that are
>> particular to the protocol/subsystem being employed.  The s3 'handler'
>> (or whatever) might be nice enough to check for format violations and
>> throw an appropriate exception, for instance.  It might URL-encode
>> the secret key so that the user doesn't have to worry about this, or
>> throw an exception notifying the user of a bad format.  Currently,
>> apparent problems with my s3 settings are throwing exceptions that
>> give no indication that the problem is actually with those settings.
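>>
>> Something along these lines (just a sketch of the idea, not actual
>> Hadoop code) could encode the secret before it ever reaches the URI
>> parser:
>>
>>   import java.net.URLEncoder;
>>
>>   public class EncodeSecret {
>>       public static void main(String[] args) throws Exception {
>>           String secret = "abc/def+ghi";  // hypothetical secret key
>>           // '/' becomes %2F and '+' becomes %2B
>>           String encoded = URLEncoder.encode(secret, "UTF-8");
>>           System.out.println(encoded);  // abc%2Fdef%2Bghi
>>           System.out.println("s3://ACCESS_KEY:" + encoded + "@mybucket/");
>>       }
>>   }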
>>
>> My mitigating strategy has been to change my configuration to use
>> "instance-local" storage (/mnt).  I then copy the results out to s3
>> using 'distcp'.  This is odd since distcp seems ok with my s3/aws
>> info.
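>>
>> Concretely, that amounts to something like this (paths and bucket name
>> made up):
>>
>>   bin/hadoop distcp file:///mnt/job-output s3://BUCKET/job-output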
>>
>> I'm still unclear as to the permissible characters in bucket names and
>> access keys.  I gather '/' is bad in the secret key and that '_' is
>> bad for bucket names.  Thus far I have only been able to get buckets to
>> work in distcp that have only letters in their names, but I haven't
>> tested too extensively.
>>
>> For example, I'd love to use buckets like:
>> 'com.organization.hdfs.purpose'.  This seems to fail.  Using
>> 'comorganizationhdfspurpose' works but clearly that is less than
>> optimal.
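>>
>> My reading of the S3 docs (not authoritative) is that DNS-style bucket
>> names must be 3 to 63 characters of lowercase letters, digits, dots and
>> dashes, starting and ending with a letter or digit, with underscores
>> disallowed; if that's right, the failure with dotted names may be on
>> the Hadoop side rather than with S3 itself.  A quick check along those
>> lines:
>>
>>   public class BucketNameCheck {
>>       // rough sketch of the DNS-style naming rule as I understand it
>>       static boolean looksValid(String bucket) {
>>           return bucket.matches("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");
>>       }
>>       public static void main(String[] args) {
>>           System.out.println(looksValid("com.organization.hdfs.purpose")); // true
>>           System.out.println(looksValid("my_bucket"));                     // false
>>       }
>>   }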
>>
>> Like I say, I haven't dug into the source yet, but it is curious that
>> distcp seems to work (at least where s3 is the destination) and hadoop
>> fails when s3 is used as its storage.
>>
>> Anyone who has dealt with these issues, please post!  It will help
>> make the project better.
>>
>> -lincoln
>>
>> --
>> lincolnritter.com
>>
>>
>>
>> On Wed, Jul 9, 2008 at 7:10 AM, slitz <slitzferrari@gmail.com> wrote:
>>> I'm having the exact same problem, any tip?
>>>
>>> slitz
>>>
>>> On Wed, Jul 2, 2008 at 12:34 AM, Lincoln Ritter
>>> <lincoln@lincolnritter.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am trying to use S3 with Hadoop 0.17.0 on EC2.  Using this style of
>>>> configuration:
>>>>
>>>> <property>
>>>>  <name>fs.default.name</name>
>>>>  <value>s3://$HDFS_BUCKET</value>
>>>> </property>
>>>>
>>>> <property>
>>>>  <name>fs.s3.awsAccessKeyId</name>
>>>>  <value>$AWS_ACCESS_KEY_ID</value>
>>>> </property>
>>>>
>>>> <property>
>>>>  <name>fs.s3.awsSecretAccessKey</name>
>>>>  <value>$AWS_SECRET_ACCESS_KEY</value>
>>>> </property>
>>>>
>>>> on startup of the cluster with the bucket having no non-alphabetic
>>>> characters, I get:
>>>>
>>>> 2008-07-01 16:10:49,171 ERROR org.apache.hadoop.dfs.NameNode:
>>>> java.lang.RuntimeException: Not a host:port pair: XXXXX
>>>>        at
>>>> org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:121)
>>>>        at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:121)
>>>>        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:178)
>>>>        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:164)
>>>>        at
>>>> org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:848)
>>>>        at org.apache.hadoop.dfs.NameNode.main(NameNode.java:857)
>>>>
>>>> If I use this style of configuration:
>>>>
>>>> <property>
>>>>  <name>fs.default.name</name>
>>>>  <value>s3://$AWS_ACCESS_KEY:$AWS_SECRET_ACCESS_KEY@$HDFS_BUCKET</value>
>>>> </property>
>>>>
>>>> I get (where the all-caps portions are the actual values...):
>>>>
>>>> 2008-07-01 19:05:17,540 ERROR org.apache.hadoop.dfs.NameNode:
>>>> java.lang.NumberFormatException: For input string:
>>>> "AWS_SECRET_ACCESS_KEY@HDFS_BUCKET"
>>>>        at
>>>> java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>>>>        at java.lang.Integer.parseInt(Integer.java:447)
>>>>        at java.lang.Integer.parseInt(Integer.java:497)
>>>>        at
>>>> org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:128)
>>>>        at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:121)
>>>>        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:178)
>>>>        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:164)
>>>>        at
>>>> org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:848)
>>>>        at org.apache.hadoop.dfs.NameNode.main(NameNode.java:857)
>>>>
>>>> These exceptions are taken from the namenode log.  The datanode logs
>>>> show the same exceptions.
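>>>>
>>>> From the two traces, both failures come out of
>>>> NetUtils.createSocketAddr, which appears to be handed the authority
>>>> part of the s3 URI and to expect a plain host:port pair.  A tiny
>>>> sketch (my guess at the parsing, not the actual Hadoop code)
>>>> reproduces both messages:
>>>>
>>>>   public class AddrParse {
>>>>       public static void main(String[] args) {
>>>>           String authority = "KEY:SECRET@BUCKET"; // from s3://KEY:SECRET@BUCKET
>>>>           int colon = authority.indexOf(':');
>>>>           if (colon < 0) {
>>>>               // the s3://BUCKET case: authority "BUCKET" has no colon
>>>>               throw new RuntimeException("Not a host:port pair: " + authority);
>>>>           }
>>>>           // the s3://KEY:SECRET@BUCKET case: everything after the first
>>>>           // colon is parsed as a port, which throws
>>>>           // NumberFormatException: For input string: "SECRET@BUCKET"
>>>>           int port = Integer.parseInt(authority.substring(colon + 1));
>>>>       }
>>>>   }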
>>>>
>>>> Other than the above configuration changes, the configuration is
>>>> identical to that generated by the hadoop image creation script found
>>>> in the 0.17.0 distribution.
>>>>
>>>> Can anybody point me in the right direction here?
>>>>
>>>> -lincoln
>>>>
>>>> --
>>>> lincolnritter.com
>>>>
>>>
>>
>
>


