hadoop-common-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Hadoop Common: Why not re-use the Security model offered by SELINUX?
Date Sat, 28 Mar 2015 13:06:32 GMT

SELinux does nothing for Hadoop cluster security at the data layer, which is why there are tools
on top, not only to lock down systems, but to provide better data governance: where did things
come from, has it been tainted by merging with sensitive data, etc.

Where it could be good is

1. Allow hadoop nodes to be more secure on the intranet itself. It's another layer in the
defense-in-depth story, so if some standard Linux service on the system (ssh, ntpd, ...) gets
compromised, the damage is partially limited. My home server is SELinux-enforced, for example.

2. Reduce the impact of anything malicious trying to run as a YARN-scheduled app.

#2 is moot until you have Kerberos up; until then the whole of HDFS is visible. Once you have
it up, SELinux could restrict what damage a privilege-escalated YARN job could do to the local
hosts. But I'm still reasonably confident that, given the ability to run 200+ containers on
a Hadoop cluster for a few hours, I could (a) portscan an intranet for SMB & SharePoint
hosts, and (b) open enough TCP connections to overload the services.

I'm +1 to getting Hadoop to run on SELinux; I think mainly we've been lazy. 

But it's not going to keep your Hadoop-stored data safe, lock down your network apps or help
mitigate the intentional or unintentional damage that Hadoop code can do if on the same intranet
as the rest of your organisation. Or, as AW on Nicholas can attest, the damage you can do
by running network traffic- or CPU-intensive code that takes down the network or power
supplies of the rest of the datacentre.

> On 28 Mar 2015, at 02:33, jay vyas <jayunit100.apache@gmail.com> wrote:
> Tools like freeipa and so on are very synergistic first steps down the road
> of making hadoop more enterprise friendly.  For example, if you let freeipa
> manage users, kerberos and so on - then you can pave the way down the road
> for selinux as well (since these tools are able to work together).
> I think in general, the more hadoop works with the linux community , rather
> than rebuilding its own solutions, the easier it will be to integrate in
> broader and broader deployments - so in theory working to run  selinux and
> hadoop together is probably a win-win.
> On Thu, Mar 26, 2015 at 1:22 PM, Aaron T. Myers <atm@cloudera.com> wrote:
>> In addition to everything Allen has already said, which I entirely agree
>> with, I'll also point out that much of the focus on Hadoop security has
>> been related to authentication, and only somewhat more recently on
>> providing advanced authorization capabilities. I'll readily admit to not
>> knowing much about SE Linux's capabilities, but my impression is that it
>> wouldn't do much to be able to help out with authentication within Hadoop,
>> and hence wouldn't have been a realistic option when Hadoop's security work
>> was started many years ago.
>> --
