hadoop-common-issues mailing list archives

From "Andy Sautins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9293) For S3 use credentials file
Date Thu, 21 Feb 2013 21:36:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583569#comment-13583569 ]

Andy Sautins commented on HADOOP-9293:
--------------------------------------


  I would be interested to get the perspective of someone who uses EMR/AWS.  In my opinion
it is a very EMR/AWS-specific use case that I'm trying to address, but I agree that my initial
stab probably isn't the best approach.

  I am still uncomfortable with your suggestions for scripts to extract credentials.  At the
end of the day I want the (probably) already existing credentials file to be the system
of record for client-side credentials.  To me it just doesn't make sense to have to place
the credentials in the Hadoop configuration files (either directly or through some script
manipulation) if they are already available in another location.

  I find the SOCKS proxy implementation to be very interesting for this situation.  It is
not only very similar to what I'm trying to achieve, but would most likely be used in conjunction
with the S3Credentials mechanism I am proposing.  To use SOCKS, you would configure the
following:

  On the client machine in core-site.xml

  <property><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.SocksSocketFactory</value></property>

  Then on the server nodes you would set the following:

  <property><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.StandardSocketFactory</value><final>true</final></property>

  That uses the SOCKS proxy factory on the client machine only.  I uploaded another patch
that takes an approach very similar to the SOCKS proxy configuration.  With this approach
I would set the following:

  On the client machine in core-site.xml

  <property><name>fs.s3.credentials.class</name><value>org.apache.hadoop.fs.s3.S3CredentialsFromFile</value></property>
  <property><name>fs.s3.credentials.file</name><value>/path/to/credentials.json</value></property>

  On the server

  <property><name>fs.s3.credentials.class</name><value>org.apache.hadoop.fs.s3.S3Credentials</value><final>true</final></property>

 I think that mimics the SOCKS proxy setup reasonably well, and it allows for specialized
S3Credentials behavior.
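
 As a rough illustration of the idea only (this is not the attached patch and does not use
the Hadoop API), the sketch below picks a credentials source by class name from a plain
Map standing in for the configuration.  The names CredentialsSource, FromConfiguration and
FromFile are hypothetical, and the JSON field names "access_id" and "private_key" are an
assumption about the layout of the credentials file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CredentialsSketch {

    /** Hypothetical stand-in for a pluggable credentials class. */
    interface CredentialsSource {
        /** Returns {accessKey, secretKey} resolved from the given settings. */
        String[] load(Map<String, String> conf) throws IOException;
    }

    /** Default behaviour: keys are read directly from the configuration. */
    static class FromConfiguration implements CredentialsSource {
        public String[] load(Map<String, String> conf) {
            return new String[] {
                conf.get("fs.s3.awsAccessKeyId"),
                conf.get("fs.s3.awsSecretAccessKey")
            };
        }
    }

    /** Client-side behaviour: keys are read from the file named by fs.s3.credentials.file. */
    static class FromFile implements CredentialsSource {
        public String[] load(Map<String, String> conf) throws IOException {
            Path file = Paths.get(conf.get("fs.s3.credentials.file"));
            String json = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // Field names are assumed; adjust to the actual credentials file layout.
            return new String[] { field(json, "access_id"), field(json, "private_key") };
        }

        // Naive lookup of a flat "name": "value" pair; real code should use a JSON parser.
        private static String field(String json, String name) {
            Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
            Matcher m = p.matcher(json);
            return m.find() ? m.group(1) : null;
        }
    }

    /** Instantiates the class named by fs.s3.credentials.class, defaulting to FromConfiguration. */
    static String[] resolve(Map<String, String> conf) throws Exception {
        String name = conf.getOrDefault("fs.s3.credentials.class",
                                        FromConfiguration.class.getName());
        CredentialsSource source =
            (CredentialsSource) Class.forName(name).getDeclaredConstructor().newInstance();
        return source.load(conf);
    }

    public static void main(String[] args) throws Exception {
        // Write a throwaway credentials file with placeholder values.
        Path tmp = Files.createTempFile("credentials", ".json");
        Files.write(tmp, "{\"access_id\": \"AKIDEXAMPLE\", \"private_key\": \"not-a-real-secret\"}"
                .getBytes(StandardCharsets.UTF_8));

        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3.credentials.class", FromFile.class.getName());
        conf.put("fs.s3.credentials.file", tmp.toString());

        System.out.println(resolve(conf)[0]);
        Files.delete(tmp);
    }
}
```

 On the server side the class setting would simply be left at its default, so the keys keep
coming from the configuration, which is the role the <final>true</final> entry plays above.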

 Note that if you still don't like it, I'm happy to look at adding this to contrib or just
close out the JIRA.  This is functionality we are using, and I believe others may find value
in it as well.

 If this seems like a reasonable approach I'll address your above concerns around documentation
and tests next.



                
> For S3 use credentials file
> ---------------------------
>
>                 Key: HADOOP-9293
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9293
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 1.0.2
>         Environment: Linux
>            Reporter: Andy Sautins
>            Priority: Minor
>              Labels: features, newbie
>         Attachments: HADOOP-9293.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The following document describes the current way that S3 credentials can be specified
> ( http://wiki.apache.org/hadoop/AmazonS3 ).  In summary they are:
>   * in the S3 URI.
>   * in the hadoop-site.xml file as 
>   ** fs.s3.awsAccessKeyId
>   ** fs.s3.awsSecretAccessKey 
>   ** fs.s3n.awsAccessKeyId
>   ** fs.s3n.awsSecretAccessKey
> The Amazon EMR tool elastic-mapreduce already provides the ability to use a credentials
> file ( see http://s3.amazonaws.com/awsdocs/ElasticMapReduce/latest/emr-qrc.pdf ).
> I would propose that we allow roughly the same access to credentials through a credentials
> file that is currently provided by elastic-mapreduce.  This should allow for centralized
> administration of credentials, which should be positive for security.
> I propose the following properties:
> {quote}
>    <property><name>fs.s3.awsCredentialsFile</name><value>/path/to/file</value></property>
>    <property><name>fs.s3n.awsCredentialsFile</name><value>/path/to/file</value></property>
> {quote}

