hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle
Date Thu, 19 Jul 2012 23:47:35 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418810#comment-13418810 ]

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

bq. The javadoc style for 'Returns BLAH' and then '@return BLAH' is Sun javadoc style.
Ew. That's disgusting. Oh well.

bq. the ReloadingX509TrustManager will work with an empty keystore if the keystore file is
not available at initialization time, and if the keystore file becomes available later on, it
will be loaded. WARNs are logged while the file is not present, so it won't go unnoticed.

WARNs in the logs are often not noticed. Don't you think it's simpler to just fail if the
configured file is not present? If someone configures this and doesn't create the file (or the
file is unreadable due to a permissions error), I think it's friendlier to fail fast. Otherwise
they'll just end up seeing strange downstream issues, like client certs not being properly
trusted, which will be much harder to root-cause back to the trust store configuration without
log spelunking.
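The fail-fast check being argued for could look something like this sketch (the class and method names are hypothetical, not anything in the patch):

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of a fail-fast startup check; names are
// illustrative, not the patch's ReloadingX509TrustManager API.
public class TrustStoreCheck {
    public static File requireReadable(String path) throws IOException {
        File f = new File(path);
        if (!f.isFile() || !f.canRead()) {
            // Fail at startup with an actionable message instead of
            // logging a WARN and running with an empty trust store.
            throw new IOException("Configured trust store is missing or "
                    + "unreadable: " + path);
        }
        return f;
    }
}
```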

bq. If reload() fails to reload the new keystore, it assumes there are no certs and runs
empty until the next reload attempt. Seems a safer assumption than continuing to run with
obsolete keys.

My worry here is that people might be using a conf management system to push out the key store
files. If the reload happens to trigger right in the middle of a conf mgmt update, and the
update is non-atomic, it will see an invalid keystore. I wouldn't want the TT to revert to
an empty key store until the next reload interval in that case.
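The behavior suggested above could be sketched like this (hypothetical names, not the patch's ReloadingX509TrustManager): swap in the new keystore only after a clean load, and on any failure keep serving the last good one until the next interval:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;

// Illustrative sketch only: if a reload fails, e.g. because a
// config-management tool is rewriting the keystore file non-atomically,
// keep the last good store instead of running empty.
public class SafeKeyStoreReloader {
    private volatile KeyStore current;  // last successfully loaded store

    public void reload(Path file, char[] password) {
        try (InputStream in = Files.newInputStream(file)) {
            KeyStore fresh = KeyStore.getInstance(KeyStore.getDefaultType());
            fresh.load(in, password);  // throws on a truncated/garbled file
            current = fresh;           // swap only after a clean load
        } catch (Exception e) {
            // Keep 'current' untouched and retry at the next reload
            // interval, instead of reverting to an empty trust store.
        }
    }

    public KeyStore getCurrent() {
        return current;
    }
}
```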

bq. While hadoop.ssl.enabled only applies to shuffle, the intention is to use it for the rest
of the HTTP endpoints. Thus, a single knob would enable SSL. That is why the property is named
and located (in core-default.xml) the way it is.

Given that it doesn't currently affect the other HTTP endpoints, I find this very confusing.
Why not make a separate config for now? Then, once it affects more than just the shuffle, you
can change the default for {{mapred.shuffle.use.ssl}} to {{${hadoop.use.ssl}}} to pick up
the system-wide default.
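For illustration, that indirection might look like this hypothetical config entry (the property names follow the example in this comment, not any shipped default):

```xml
<!-- Hypothetical mapred-default.xml entry; property names are
     illustrative, following the example above. -->
<property>
  <name>mapred.shuffle.use.ssl</name>
  <value>${hadoop.use.ssl}</value>
  <description>Whether the shuffle uses SSL. Defaults to the
  cluster-wide hadoop.use.ssl setting once that knob covers all
  HTTP endpoints.</description>
</property>
```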

bq. In the TestSSLFactory, the Assert.fail() statements are sections the test should not
reach; they are used for negative tests.

I get that. But, if the test breaks, you'll end up with a meaningless failure, instead of
a message explaining why it failed. If you let the exception fall through, then the failed
unit test would actually have a stack trace that explains why it failed, which aids in debugging.
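The pattern I'm suggesting can be sketched as a small plain-Java helper, no JUnit dependency needed (names are hypothetical): assert the expected exception type, and let anything unexpected propagate with its own stack trace:

```java
// Minimal sketch of the negative-test pattern: capture the exception the
// test expects, rethrow anything else so a breakage surfaces with a full
// stack trace instead of a bare Assert.fail(). Names are hypothetical.
public class NegativeTestSketch {
    @FunctionalInterface
    public interface Thrower {
        void run() throws Exception;
    }

    public static <T extends Exception> T expect(Class<T> type, Thrower body)
            throws Exception {
        try {
            body.run();
        } catch (Exception e) {
            if (type.isInstance(e)) {
                return type.cast(e);  // the failure the negative test wants
            }
            throw e;  // unexpected: surfaces with a full stack trace
        }
        throw new AssertionError("expected " + type.getSimpleName()
                + " but nothing was thrown");
    }
}
```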

bq. Client certs are disabled by default. If they are per job, yes they could be shipped via
DC. This would require an alternate implementation of the KeyStoresFactory; the mechanism
for that is already in place.

Does it need an alternate implementation? The distributed cache files can be put on the classpath
already, in which case the existing keystore-loading code should be able to find them. The
only change would be in the documentation -- explaining that the client should ship the files
via distributed cache rather than putting them in HADOOP_CONF_DIR. Why wouldn't that be enough?
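The classpath-loading point above can be sketched as follows (hypothetical names; this is not the patch's KeyStoresFactory):

```java
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.security.KeyStore;

// Sketch of the argument above: a keystore shipped via the distributed
// cache lands on the task classpath, so ordinary classpath resource
// loading finds it without a new KeyStoresFactory implementation.
// The resource name passed in is hypothetical.
public class ClasspathKeyStoreLoader {
    public static KeyStore load(String resource, char[] password) throws Exception {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        try (InputStream in = cl.getResourceAsStream(resource)) {
            if (in == null) {
                throw new FileNotFoundException(resource + " not on classpath");
            }
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            ks.load(in, password);
            return ks;
        }
    }
}
```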
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch,
> MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch,
> MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently shuffle fetches go over the wire in the clear. While Kerberos provides comprehensive
> authentication for the cluster, it does not provide confidentiality.
> When processing sensitive data, confidentiality may be desired (at the expense of job
> performance and resource utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
