hadoop-mapreduce-user mailing list archives

From Lin Zhao <...@exabeam.com>
Subject Re: Is it possible to turn on data node encryption without kerberos?
Date Wed, 06 Apr 2016 23:37:51 GMT

Thanks a lot for the detailed response. A little background on our deployment: our HDFS cluster
is a single-tenant deployment on a Docker cluster running across several hosts. Inside the
Docker containers there's only one root user running everything, and there's no external network
access to the container network. So we are ensuring authentication by controlling access
to the physical boxes. Our major concern is sniffing of the data in transit.

How does hadoop.rpc.protection behave if it is set to privacy but Kerberos is not enabled? Is the
communication still encrypted?

From: Chris Nauroth <cnauroth@hortonworks.com>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: musty_rehmani@yahoo.com, Lin Zhao <lin@exabeam.com>, user@hadoop.apache.org
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.
We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data
transfer encryption, but not Kerberos, and those tests are passing.
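
For illustration, an hdfs-site.xml fragment along the lines of what that test configures might look like the following. This is a sketch under stated assumptions, not a verified production configuration: dfs.encrypt.data.transfer and dfs.block.access.token.enable are standard HDFS properties, but whether these two alone are sufficient on a non-Kerberos cluster should be checked against the Hadoop version in use.

```xml
<!-- hdfs-site.xml (sketch; set on the NameNode and DataNodes, then restart them) -->
<configuration>
  <property>
    <!-- Encrypt block data sent over the data transfer protocol -->
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>
  <property>
    <!-- Assumption: required so that data encryption keys can be issued
         alongside block access tokens -->
    <name>dfs.block.access.token.enable</name>
    <value>true</value>
  </property>
</configuration>
```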

The hadoop.rpc.protection setting is unrelated to data transfer protocol. Instead, it controls
the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.
This won't really be active unless Kerberos authentication is enabled though.
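
As a sketch of the settings being contrasted here, hadoop.rpc.protection lives in core-site.xml and is normally paired with Kerberos authentication. The values below are illustrative only; a real Kerberos deployment also needs principals and keytabs, which are not shown.

```xml
<!-- core-site.xml (sketch) -->
<configuration>
  <property>
    <!-- "simple" is the default and means no Kerberos -->
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <!-- SASL quality of protection for RPC:
         authentication, integrity, or privacy -->
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
</configuration>
```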

Please note that even though it's possible to enable data transfer protocol encryption without
using Kerberos authentication in the cluster, the benefit of that is questionable in a production
deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to
spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit
becomes irrelevant at that point.

--Chris Nauroth

From: Musty Rehmani <musty_rehmani@yahoo.com.INVALID>
Reply-To: musty_rehmani@yahoo.com
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <lin@exabeam.com>, user@hadoop.apache.org
Subject: Re: Is it possible to turn on data node encryption without kerberos?

Kerberos is used to authenticate a user or service principal in order to grant access to the
cluster. It doesn't encrypt data blocks moving in and out of the cluster.

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <lin@exabeam.com> wrote:
I've been trying to secure block data transferred by HDFS. I added the settings below to hdfs-site.xml
and core-site.xml on the data node and name node and restarted both.



When I try to put a file from the hdfs command line shell, the operation fails with "connection
is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake
from client at / Perhaps the client is running an older version of Hadoop
which does not support encryption"

I am able to reproduce this on two different deployments. I was following
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication
but didn't turn on Kerberos authentication. A setup with no authentication works for my environment. Can
this be the reason the handshake fails?

Any help is appreciated.


Lin Zhao
