impala-reviews mailing list archives

From "Laszlo Gaal (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Date Wed, 08 Nov 2017 15:07:08 GMT
Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, David Knupp, Joe
McDonnell, Alex Behm, 

I'd like you to reexamine a change. Please visit

to look at the new patch set (#2).

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

For some time, Impala in production environments has been able
to access data stored in Amazon S3 buckets using credentials supplied
in any of the following ways:
- storing Amazon access keys in environment variables or
  in core-site.xml,
- using proprietary management tools to store Amazon access keys,
- using Amazon IAM roles bound to VMs running in EC2.

The development minicluster environment used the first approach,
which risked leaking these keys.

This change paves the way for Impala development setups to use IAM
roles to access S3 buckets when running on an Amazon EC2 virtual
machine. The changes mainly ensure that traditional credentials
supplied in environment variables do not conflict with credentials
supplied by the IAM role attached to the VM instance.
The IAM role based credentials are accessible through the EC2
instance-property mechanism; for further details see Amazon's docs at

Changes to the configuration script:
1. bin/ stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
   these variables are unset (removed from the environment); otherwise
   they would preempt authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.
3. bin/ now checks if credentials are supplied in the
   AWS_* variables or via the IAM role.
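The decision described in points 1 to 3 can be sketched as follows. This is
only an illustration in Python, not the code of the configuration script: the
function name and the return values are hypothetical, and treating a partially
supplied key pair as "not supplied" is an assumption.

```python
# Variable names are taken from the change description above.
AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def select_s3_auth(env):
    """Pick the S3 credential source for a dict-like environment.

    Returns "env" when both keys are present and non-empty. Otherwise the
    keys are removed from `env` and "iam" is returned, so that leftover
    empty or dummy values cannot preempt the IAM role.
    (Hypothetical helper; the names and return values are assumptions.)
    """
    if all(env.get(name) for name in AWS_VARS):
        return "env"
    for name in AWS_VARS:
        # Unset the variable entirely instead of leaving a placeholder.
        env.pop(name, None)
    return "iam"


if __name__ == "__main__":
    # Work on a copy so the real process environment is left untouched.
    import os
    print(select_s3_auth(dict(os.environ)))
```

Removing the variables outright, rather than leaving them empty, reflects the
reasoning in point 1: any value left in the environment would take precedence
over the role-based credentials.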

Changes to the minicluster configuration:
1. Some front-end tests still refer to the old S3 connector, s3n:.
   This connector does not support S3 authentication via IAM roles, but
   this is not a problem: these front-end tests don't actually reach
   out to S3; the s3n: notation is just for the FE.
   For these tests to work, authentication parameters still need to
   exist for the s3n: connector in core-site.xml, but the values do
   not matter, so the configuration template now has fixed dummy
   values for the s3n: AWS credentials.

2. Remove empty AWS credentials from core-site.xml.tmpl:
   The testdata/cluster/admin setup script substitutes values from
   environment variables into Hadoop *-site.xml configuration files
   when setting up the minicluster runtime environment.

   The configuration section for s3a: credentials is now completely
   removed if:
   - the target filesystem is set to "s3"
   - and the AWS credential environment variables AWS_ACCESS_KEY_ID
     and AWS_SECRET_ACCESS_KEY are both empty or missing.

   The configuration file core-site.xml.tmpl was extended with
   comment markers that delimit the section to be removed in this case.
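The marker-based removal described above could look roughly like this. It is a
minimal Python sketch, not the logic of testdata/cluster/admin: the marker
strings and the function name are hypothetical, and the real markers in
core-site.xml.tmpl may differ.

```python
# Hypothetical marker comments; the actual template may use other text.
BEGIN = "<!-- BEGIN_S3A_CREDENTIALS -->"
END = "<!-- END_S3A_CREDENTIALS -->"


def strip_s3a_credentials(template, target_fs, env):
    """Drop the marker-delimited s3a: credential section when appropriate.

    The section is removed only if the target filesystem is "s3" and both
    AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are empty or missing,
    mirroring the two conditions listed above.
    """
    no_keys = (not env.get("AWS_ACCESS_KEY_ID")
               and not env.get("AWS_SECRET_ACCESS_KEY"))
    if target_fs != "s3" or not no_keys:
        return template

    kept = []
    skipping = False
    for line in template.splitlines(keepends=True):
        if BEGIN in line:
            skipping = True    # the marker line itself is dropped too
        elif END in line:
            skipping = False   # resume copying after the end marker
        elif not skipping:
            kept.append(line)
    return "".join(kept)
```

With the section removed, no empty s3a: keys remain in the generated
core-site.xml, which leaves the role-based credentials as the only source.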

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
M bin/
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
3 files changed, 92 insertions(+), 21 deletions(-)

  git pull ssh:// refs/changes/94/8294/2
To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: David Knupp <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Joe McDonnell <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Laszlo Gaal <>
Gerrit-Reviewer: Michael Brown <>
Gerrit-Reviewer: Philip Zeyliger <>
Gerrit-Reviewer: Sailesh Mukil <>
