hadoop-common-user mailing list archives

From Larry McCay <lmc...@hortonworks.com>
Subject Re: Securing secrets for S3 FileSystems in DistCp
Date Tue, 03 May 2016 13:09:09 GMT
Hi Elliot -

You may find the following patch interesting: https://issues.apache.org/jira/browse/HADOOP-12548

This enables the use of the Credential Provider API to protect secrets for the s3a filesystem.
The design document attached to it describes how to use it.
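For context, the workflow described in that design document roughly looks like the following. This is a sketch only: the keystore path, namenode address, and bucket name are placeholders you would substitute for your own environment.

```shell
# Store the s3a credentials in a JCEKS keystore (the CLI prompts for each value)
hadoop credential create fs.s3a.access.key \
    -provider jceks://hdfs/user/admin/s3.jceks
hadoop credential create fs.s3a.secret.key \
    -provider jceks://hdfs/user/admin/s3.jceks

# Point DistCp at the provider so no secrets appear on the command line
# or in cluster configuration files
hadoop distcp \
    -Dhadoop.security.credential.provider.path=jceks://hdfs/user/admin/s3.jceks \
    hdfs://namenode:8020/data s3a://my-bucket/data
```

The provider path can also be set once in core-site.xml via hadoop.security.credential.provider.path rather than on each invocation.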

If you are not using s3a, there is similar support for the Credential Provider API in s3 and
s3n, but there are slight differences in the processing.
As far as I can tell, s3a is considered the strategic filesystem for accessing S3.
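One of those differences is the credential alias names, which follow each connector's own configuration properties. The aliases below reflect my reading of the connectors; check them against your Hadoop version.

```shell
# s3n uses its own property names as the credential aliases
# (the path jceks://hdfs/user/admin/s3.jceks is a placeholder)
hadoop credential create fs.s3n.awsAccessKeyId \
    -provider jceks://hdfs/user/admin/s3.jceks
hadoop credential create fs.s3n.awsSecretAccessKey \
    -provider jceks://hdfs/user/admin/s3.jceks
```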

Hope this is helpful.


On May 3, 2016, at 8:41 AM, Elliot West <teabot@gmail.com> wrote:


We're currently using DistCp and the S3 FileSystems to move data from a vanilla Apache Hadoop
cluster to S3. We've been concerned about exposing our AWS secrets on our shared, on-premise
cluster. As a work-around, we've patched DistCp to load these secrets from a JCEKS keystore.
This seems to work quite well; however, we're not comfortable relying on a DistCp fork.

What is the usual approach to achieve this with DistCp and is there a feature or practice
that we've overlooked? If not, might there be value in us raising a JIRA ticket and submitting
a patch for DistCp to include this secure keystore functionality?

Thanks - Elliot.
