flink-user mailing list archives

From "Newport, Billy" <Billy.Newp...@gs.com>
Subject RE: Impersonation support in Flink
Date Tue, 24 Oct 2017 14:26:28 GMT
Our scenario is to enable a specific Kerberos principal to impersonate any Kerberos principal
in a specific group; this is enabled in the HDFS configuration. That principal does not need
to be root, just a principal allowed to impersonate the users in that group.

We want the job to access HDFS as the impersonated Kerberos principal, not the one that launched it.
We do this with our MR jobs by simply impersonating in the driver, and all the mappers/reducers
run correctly and use the impersonated user that was active when the job was submitted. We expected
Flink to work similarly and found the issue.

We do this without the keytab for that user; if we had it, we wouldn’t need to impersonate,
if you see what I mean.

So, what kind of changes would be needed, and where, to implement this function? We are happy
to do the patch to enable this behavior.

Billy


From: Eron Wright [mailto:eronwright@gmail.com]
Sent: Monday, October 23, 2017 4:53 PM
To: Chan, Regina [Tech]
Cc: user@flink.apache.org
Subject: Re: Impersonation support in Flink

Hello,
Flink does initialize the process-wide login user, using the UGI's Kerberos login method.
It doesn't support proxy users at the moment. Let's dig into the scenario a bit to see how
best to support it.

As you know, the proxy user functionality of Hadoop allows a process that has superuser credentials
to impersonate a normal user when making remote calls to HDFS and other remote services.
A possible scenario would be that the Flink cluster has a superuser account and accesses HDFS
on behalf of someone. Keep in mind that job code runs with full trust within the JM/TM,
and would have access to the superuser keytab. Does that sound like your scenario?

Proxy user support would not facilitate the scenario of running a user's job code such that
the job accesses HDFS as that user. The only way to support that scenario is by launching
the cluster using that user's keytab.

I hope this helps,
Eron

On Mon, Oct 23, 2017 at 10:52 AM, Chan, Regina <Regina.Chan@gs.com> wrote:
Hi folks,

Is Flink able to do impersonation using UserGroupInformation? How do we make all the tasks
run with this in a way that we wouldn’t have to do it per task?


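import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// proxyUser (the user to impersonate) and action (the work to run) are defined elsewhere.
UserGroupInformation ugi =
    UserGroupInformation.createProxyUser(proxyUser, UserGroupInformation.getLoginUser());
PrivilegedExceptionAction<Void> iAction = new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        action.run();
        return null;
    }
};
ugi.doAs(iAction);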
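
To illustrate the "not per task" part of the question, here is a minimal sketch of wrapping tasks once so that each one always runs inside a doAs call. It is only an illustration within a single JVM: the names ImpersonationSketch, Impersonator, impersonated and RecordingImpersonator are hypothetical, and the recording class stands in for an implementation that would delegate to ugi.doAs on a proxy-user UGI. It does not make Flink propagate a proxy user to the JobManager or TaskManagers, which the reply above says is not supported.

```java
import java.security.PrivilegedExceptionAction;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

final class ImpersonationSketch {

    /** Hypothetical stand-in for the doAs method of a proxy-user UGI. */
    interface Impersonator {
        <T> T doAs(PrivilegedExceptionAction<T> action) throws Exception;
    }

    /** Wraps a task so that its body always runs inside imp.doAs(...). */
    static <T> Callable<T> impersonated(Impersonator imp, Callable<T> task) {
        return () -> imp.<T>doAs(() -> task.call());
    }

    /** Fake impersonator for the demo: logs entry and exit around the action. */
    static final class RecordingImpersonator implements Impersonator {
        private final String user;
        final List<String> log = new ArrayList<>();

        RecordingImpersonator(String user) {
            this.user = user;
        }

        @Override
        public <T> T doAs(PrivilegedExceptionAction<T> action) throws Exception {
            log.add("enter:" + user);
            try {
                return action.run();
            } finally {
                log.add("exit:" + user);
            }
        }
    }

    /** Runs two wrapped tasks and returns the recorded log. */
    static List<String> demo() {
        RecordingImpersonator imp = new RecordingImpersonator("alice");
        try {
            impersonated(imp, () -> imp.log.add("task1")).call();
            impersonated(imp, () -> imp.log.add("task2")).call();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return imp.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a wrapper like this, the impersonation boilerplate lives in one place and callers only choose which tasks to wrap. Whether a similar hook could be added inside Flink's own task execution is exactly the open question in this thread.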



Regina Chan
Goldman Sachs – Enterprise Platforms, Data Architecture
30 Hudson Street, 37th floor | Jersey City, NY 07302
(212) 902-5697

