hadoop-common-commits mailing list archives

From lmc...@apache.org
Subject hadoop git commit: HADOOP-13001 - Clearly Document the Password Details for Keystore-based Credential Providers
Date Fri, 22 Apr 2016 00:47:31 GMT
Repository: hadoop
Updated Branches:
  refs/heads/branch-2.8 60c68c65c -> 497446a3c


HADOOP-13001 - Clearly Document the Password Details for Keystore-based Credential Providers


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/497446a3
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/497446a3
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/497446a3

Branch: refs/heads/branch-2.8
Commit: 497446a3c7d23ffd9d56fb459ef72fc4f1219632
Parents: 60c68c6
Author: Larry McCay <lmccay@hortonworks.com>
Authored: Thu Apr 21 20:42:05 2016 -0400
Committer: Larry McCay <lmccay@hortonworks.com>
Committed: Thu Apr 21 20:47:07 2016 -0400

----------------------------------------------------------------------
 .../src/site/markdown/CredentialProviderAPI.md  | 35 ++++++++++++++++++++
 1 file changed, 35 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/497446a3/hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md
b/hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md
index ad5a2e3..d6e4ee7 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md
@@ -122,3 +122,38 @@ Utilizing the credential command will often be for provisioning a password
or se
 Example: `hadoop credential create ssl.server.keystore.password jceks://file/tmp/test.jceks`
 
 In order to indicate a particular provider type and location, the user must provide the `hadoop.security.credential.provider.path`
configuration element in core-site.xml or use the command line option `-provider` on each
of the credential management commands. This provider path is a comma-separated list of URLs
that indicates the type and location of a list of providers that should be consulted. For
example, the following path: `user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks`
indicates that the current user's credentials file should be consulted through the User Provider,
that the local file located at `/tmp/test.jceks` is a Java Keystore Provider and that the
file located within HDFS at `nn1.example.com/my/path/test.jceks` is also a store for a Java
Keystore Provider.
+
+#### Provider Types
+
+1. The `UserProvider`, which is represented by the provider URI `user:///`, is used to retrieve
credentials from a user's Credentials file. This file is used to store various tokens, secrets
and passwords that are needed by executing jobs and applications.
+2. The `JavaKeyStoreProvider`, which is represented by the provider URI `jceks://file|hdfs/path-to-keystore`,
is used to retrieve credentials from a Java keystore. The underlying use of the Hadoop filesystem
abstraction allows credentials to be stored on the local filesystem or within HDFS.
+3. The `LocalJavaKeyStoreProvider`, which is represented by the provider URI `localjceks://file/path-to-keystore`,
is used to access credentials from a Java keystore that must be stored on the local filesystem.
This is needed for credentials that would result in a recursive dependency on accessing HDFS.
Whenever a credential is required to gain access to HDFS, we cannot depend on getting
that credential out of HDFS.
+
+#### Keystore Passwords
+
+Keystores in Java are generally protected by passwords. The primary protection for the
keystore-based credential providers is OS-level file permissions and any other policy-based
access protection that may exist for the target filesystem. While the password is not
a primary source of protection, it is very important to understand the mechanics required
and options available for managing these passwords. It is also very important to understand
all the parties that will need access to the password used to protect the keystores in order
to consume them at runtime.
+
+##### Options
+| Option | Description | Notes |
+|:---- |:---- |:---|
+|Default password |This is a hardcoded password of "none". |This is a hardcoded password in
an open source project and as such has obvious disadvantages. However, the mechanics section
will show that it is simpler than, and consequently nearly as secure as, the other more complex options.|
+|Environment variable|`HADOOP_CREDSTORE_PASSWORD`|This option uses an environment variable
to communicate the password that should be used when interrogating all of the keystores that
are configured in the `hadoop.security.credential.provider.path` configuration property. All
of the keystore-based providers in the path will need to be protected by the same password.|
+|Password-file|`hadoop.security.credstore.java-keystore-provider.password-file`|This option
uses a "side file" that has its location configured in the `hadoop.security.credstore.java-keystore-provider.password-file`
configuration property to communicate the password that should be used when interrogating
all of the keystores that are configured in the `hadoop.security.credential.provider.path`
configuration property.|
+
+##### Mechanics
+It is extremely important to consider that *all* of the runtime consumers of the credential being
protected (MapReduce jobs/applications) will need to have access to the password used to protect
the keystore providers. Communicating this password can be done in a number of ways, which
are described in the Options section above.
+
+|Keystore Password| Description|Sync Required|Clear Text|File Permissions|
+|:---- |:---- |:---|:---|:---|
+|Default Password|Hardcoded password is the default. Essentially, when using the default
password for all keystore-based credential stores, we are leveraging the file permissions
to protect the credential store and the keystore password is just a formality of persisting
the keystore.|No|Yes|No (documented)|
+|Environment Variable|The `HADOOP_CREDSTORE_PASSWORD` environment variable must be set to the
custom password for all keystores that may be configured in the provider path of any process
that needs to access credentials from a keystore-based credential provider. There is only
one environment variable for the entire path of comma-separated providers. It is difficult to know
the passwords required for each keystore, so it is suggested that the same password be used for all
keystore-based credential providers to avoid this issue. Setting the environment variable
will likely require it to be set from a script or some other clear text storage mechanism.
Environment variables of running processes can be read with various Unix commands.|Yes|Yes|No|
+|Password File|The `hadoop.security.credstore.java-keystore-provider.password-file` configuration
property must be set to the location of the "side file" that contains the custom password
for all keystores that may be configured in the provider path. Any process that needs to access
credentials from a keystore-based credential provider will need to have this configuration
property set to the appropriate file location. There is only one password-file for the entire
path of comma-separated providers. It is difficult to know the passwords required for each
keystore, so it is suggested that the same password be used for all keystore-based credential
providers to avoid this issue. Password-files are additional files that need to be managed,
store the password in clear text and need file permissions to be set such that only those
that need access to them have it. If file permissions are set inappropriately, the password
to access the keystores is available in clear text.|Yes|Yes|Yes|
+
+The use of the default password means that no additional communication/synchronization to
runtime consumers needs to be done. The default password is known but file permissions are
the primary protection of the keystore.
+
+When file permissions are thwarted, there are, unlike with "side files", no standard tools that
can expose the protected credentials, even with the password known. Keytool requires a password
that is six characters or more and doesn't know how to retrieve general secrets from a keystore.
It is also limited to PKI keypairs. Editors will not reveal the secrets stored within the
keystore, nor will `cat`, `more` or any other standard tools. This is why the keystore providers
are better than "side file" storage of credentials.
+
+That said, it is trivial for someone to write code to access the credentials stored within
a keystore-based credential provider using the API. Again, when using the default password,
the password is merely a formality of persisting the keystore. The *only* protection is file
permissions and OS-level access policy.
+
+Users may decide to use a password "side file" to store the password for the keystores themselves
and this is supported. It is important, however, to be aware of the mechanics that this
option requires.
+
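The provider path described in the patch is a comma-separated list of URLs whose scheme selects the provider type. As an illustration only (this is not Hadoop code), the following Python sketch splits such a path and labels each entry. The function name `describe_provider_path` is hypothetical, and the treatment of the authority part (`file` or `hdfs@host`) is an assumption based on the example URIs in the documentation.

```python
from urllib.parse import urlsplit

# Scheme-to-provider mapping taken from the "Provider Types" section of the patch.
PROVIDER_TYPES = {
    "user": "UserProvider",
    "jceks": "JavaKeyStoreProvider",
    "localjceks": "LocalJavaKeyStoreProvider",
}


def describe_provider_path(provider_path):
    """Return a (provider type, filesystem, location) tuple for each path entry."""
    described = []
    for entry in provider_path.split(","):
        parts = urlsplit(entry.strip())
        provider = PROVIDER_TYPES.get(parts.scheme, "unknown")
        if parts.scheme == "user":
            # The user provider has no keystore file location.
            described.append((provider, None, None))
            continue
        # Assumption: "fs@host" names the underlying filesystem and its host,
        # while a bare authority such as "file" names only the filesystem.
        filesystem, _, host = parts.netloc.partition("@")
        described.append((provider, filesystem, host + parts.path))
    return described


if __name__ == "__main__":
    path = "user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks"
    for item in describe_provider_path(path):
        print(item)
```

Run against the example path from the documentation, the sketch reports one user provider, one keystore on the local filesystem and one keystore in HDFS.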

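The three password options in the patch can be pictured as a simple fallback chain. The Python sketch below is an illustration, not the Hadoop implementation: the function name `resolve_keystore_password` is hypothetical, and the lookup order (environment variable, then password file, then the default) is an assumption that the documentation itself does not state. Reading the password file from a plain filesystem path is likewise an assumption.

```python
import os

ENV_VAR = "HADOOP_CREDSTORE_PASSWORD"  # name given in the Options table
DEFAULT_PASSWORD = "none"              # hardcoded default from the Options table


def resolve_keystore_password(environ=None, password_file=None):
    """Return (password, source) for the whole provider path.

    Assumed order: environment variable, then password file, then default.
    """
    environ = os.environ if environ is None else environ
    if ENV_VAR in environ:
        return environ[ENV_VAR], "environment variable"
    if password_file is not None:
        # The side file holds the password in clear text.
        with open(password_file, encoding="utf-8") as handle:
            return handle.read().strip(), "password file"
    return DEFAULT_PASSWORD, "default"


if __name__ == "__main__":
    print(resolve_keystore_password(environ={}))
```

Because a single password is resolved for the entire provider path, every keystore in that path has to be protected by the same password, which is the point the Mechanics table makes.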

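Since both the keystore and any password "side file" rely on file permissions as their primary protection, it can be useful to verify those permissions before deployment. The following Python sketch assumes a POSIX filesystem and an owner-only policy; the helper name `is_owner_only` is hypothetical, and a site may legitimately need group access instead.

```python
import os
import stat


def is_owner_only(path):
    """Return True when neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    import sys

    for candidate in sys.argv[1:]:
        status = "ok" if is_owner_only(candidate) else "too permissive"
        print(f"{candidate}: {status}")
```

A check like this does not replace OS-level access policy; it only flags files whose mode bits expose a clear-text password or keystore to more users than intended.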