hadoop-yarn-issues mailing list archives

From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2198) Remove the need to run NodeManager as privileged account for Windows Secure Container Executor
Date Tue, 08 Jul 2014 13:01:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054909#comment-14054909 ]

Remus Rusanu commented on YARN-2198:

I have uploaded a first patch so we can start the review discussion. Here is a summary of the changes:

 - `winutils service` is a new winutils CLI option that causes winutils to attach to the SCM
(i.e. start as a service) and open an LPC endpoint. This service is required to run with elevated
privileges (LocalSystem).
 - an LPC protocol is declared:
interface Hdpwinutilsvc
{
	typedef struct {
		[string] const wchar_t* cwd;
		[string] const wchar_t* jobName;
		[string] const wchar_t* user;
		[string] const wchar_t* pidFile;
		[string] const wchar_t* cmdLine;
	} CREATE_PROCESS_REQUEST;

	typedef struct {
		LONG_PTR hProcess;
		LONG_PTR hThread;
		LONG_PTR hStdIn;
		LONG_PTR hStdOut;
		LONG_PTR hStdErr;
	} CREATE_PROCESS_RESPONSE;

	error_status_t WinutilsCreateProcessAsUser(
		[in] int nmPid,
		[in] CREATE_PROCESS_REQUEST *request,
		[out] CREATE_PROCESS_RESPONSE **response);
}
 - hadoop.dll JNI is extended via NativeIO.createTaskAsUser to use the LPC mechanism to ask
the winutils service to start the containers (and the localizer too)
 - The winutils service does not do any S4U impersonation work. It simply spawns winutils
again, with the appropriate command line for S4U (i.e. YARN-1063). The process is created suspended,
and the process handle, the main thread handle and the stdin/stdout/stderr handles are duplicated
into the NM process. The LPC call response (out) structure contains all these handles.
 - NM takes ownership of the spawned process, creates Java Input/Output streams around
stdin/stdout/stderr and then resumes the process. The resumed process does the S4U work, spawns
the secure container process and waits for the container execution to finish (ditto for localization).
 - The NM uses org.apache.hadoop.io.nativeio.NativeIO.WinutilsProcessStub to control the process
spawned by winutils. This class uses several JNI methods to control this process.
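
To make the ordering on the NM side concrete, here is a rough Java sketch of the sequence described above: wrap the returned handles in streams, resume the suspended process, drain its output, then wait for exit. The SuspendedProcess interface and FakeSuspendedProcess class are hypothetical stand-ins invented for this illustration. They are not the NativeIO.WinutilsProcessStub API from the patch, and the fake runs in memory so no Windows handles or JNI are involved.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical view of a process that was created suspended by the service.
interface SuspendedProcess {
    InputStream stdout(); // stream around the duplicated stdout handle
    void resume();        // resume the suspended main thread
    int waitFor();        // block until exit and return the exit code
}

// In-memory fake so the sketch runs anywhere.
class FakeSuspendedProcess implements SuspendedProcess {
    private boolean resumed = false;
    private final byte[] output =
        "container done\n".getBytes(StandardCharsets.UTF_8);

    public InputStream stdout() { return new ByteArrayInputStream(output); }
    public void resume() { resumed = true; }
    public int waitFor() {
        if (!resumed) {
            throw new IllegalStateException("process was never resumed");
        }
        return 0;
    }
    boolean isResumed() { return resumed; }
}

class ContainerLaunchSketch {
    // Streams are wired up before the process is resumed, so no output is lost.
    static String launch(SuspendedProcess p) {
        InputStream out = p.stdout();   // 1. take ownership of the stream
        p.resume();                     // 2. let the process start its work
        String firstLine;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(out, StandardCharsets.UTF_8))) {
            firstLine = reader.readLine(); // 3. read the process output
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int exit = p.waitFor();         // 4. wait for completion
        return exit + ":" + firstLine;
    }
}
```

Calling ContainerLaunchSketch.launch(new FakeSuspendedProcess()) walks through the four steps against the fake. Skipping the resume step makes the fake's waitFor fail, which mirrors the real situation where a suspended process would never finish.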

Access check

1. Service Access check. The winutils service authorizes the caller for permission to use
the elevated service create process feature. Access check is performed in the RPC authorization
context using AuthZ and the RPC client context. Authorization is checked against an ad-hoc
security descriptor that describes the configurable 'allowed' users. Normally this should
contain the NM (or the YARN group perhaps).

2. The impersonation access check. winutils task createAsUser performs the access check on
the user being impersonated against the configurable 'allowed' and 'denied' lists. The check
is done using AuthZ, with an AuthZ context derived from the user logon token (LsaLogonUser
token, see YARN-1063), against an ad-hoc security descriptor that describes the two configurable
lists ('allowed' and 'denied'). Note that the access check is not done at the winutils service
LPC call layer, but at the S4U layer. This way the winutils tool cannot be used outside the
service call context to bypass the check. True, outside the service the check prevents something that
the caller, in that context (i.e. not when using the winutils service), is already allowed to do, so
the caller could use any other tool of choice (PoSh scripts) to do the same. A second reason
to do it at this layer (some would say the true reason...) is that this layer has the proper
infrastructure to do the check (the logon handle). Had the check been done at the winutils LPC
service layer, that code would have to also obtain the logon token just to do the check. Doing
the check at the S4U layer is both simpler and more intuitive for an admin user.
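
As an illustration of the list semantics only, the following Java sketch models an 'allowed'/'denied' decision over the principals (user name plus group names) carried by a logon token. The real check goes through AuthZ against a security descriptor. This sketch assumes that a deny match wins over any allow match and that no match at all means deny, which is how a canonically ordered Windows DACL behaves. The class and method names are hypothetical and do not appear in the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ImpersonationAccessSketch {
    // principals: the user plus the groups present in the user's token.
    static boolean isAllowed(Set<String> principals,
                             Set<String> allowed,
                             Set<String> denied) {
        // Assumption: any match on the denied list rejects the request.
        for (String p : principals) {
            if (denied.contains(p)) {
                return false;
            }
        }
        // Otherwise at least one principal must be on the allowed list.
        for (String p : principals) {
            if (allowed.contains(p)) {
                return true;
            }
        }
        // Assumption: no match means access is denied.
        return false;
    }

    // Small helper to build a set from names.
    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }
}
```

With allowed = {hadoopusers} and denied = {Administrators}, a user in hadoopusers passes, a user in both groups is rejected, and a user in neither group is rejected as well.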

What's not present in this 1.patch:

 - the access check configuration is based on settings in yarn-site.xml. Will need to be moved
to a separate config file (TBD if xml or not).
 - The NM handling of the spawned process (parse the output, wait for completion, handle timeout
if any) is a duplicate of ShellProcess.ShellCommandExecutor. I tried to refactor the latter
to handle an injected Process rather than the one it spawns itself, but it ripples over.
 - code needs cleanup, it shows the signs of the mighty struggle it took to get it to work.
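
For the refactoring mentioned above, the idea of handling an injected Process rather than a self-spawned one could look roughly like the Java sketch below. InjectedProcessExecutor and CannedProcess are hypothetical names made up for this example. This is not how the existing executor is structured, and timeout handling is left out to keep it short.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical executor: the caller supplies the Process, however it was created.
class InjectedProcessExecutor {
    private final StringBuilder output = new StringBuilder();
    private int exitCode = -1;

    void run(Process process) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // Drain stdout until EOF before waiting for the exit code.
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
            exitCode = process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    String getOutput() { return output.toString(); }
    int getExitCode() { return exitCode; }
}

// Fake Process with canned stdout and exit code, standing in for the
// process object the NM would build around the handles it received.
class CannedProcess extends Process {
    private final byte[] stdout;
    private final int exit;

    CannedProcess(String stdout, int exit) {
        this.stdout = stdout.getBytes(StandardCharsets.UTF_8);
        this.exit = exit;
    }

    @Override public OutputStream getOutputStream() { return new ByteArrayOutputStream(); }
    @Override public InputStream getInputStream() { return new ByteArrayInputStream(stdout); }
    @Override public InputStream getErrorStream() { return new ByteArrayInputStream(new byte[0]); }
    @Override public int waitFor() { return exit; }
    @Override public int exitValue() { return exit; }
    @Override public void destroy() { }
}
```

The point of the design is that output parsing and completion handling depend only on the java.lang.Process contract, so the same code path could serve both a locally spawned process and one handed back by the service.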

> Remove the need to run NodeManager as privileged account for Windows Secure Container
> ----------------------------------------------------------------------------------------------
>                 Key: YARN-2198
>                 URL: https://issues.apache.org/jira/browse/YARN-2198
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>              Labels: security, windows
>         Attachments: YARN-2198.1.patch
> YARN-1972 introduces a Secure Windows Container Executor. However, this executor requires
the process launching the container to be LocalSystem or a member of the local Administrators
group. Since the process in question is the NodeManager, the requirement translates to the
entire NM running as a privileged account, a very large surface area to review and protect.
> This proposal is to move the privileged operations into a dedicated NT service. The NM
can run as a low privilege account and communicate with the privileged NT service when it
needs to launch a container. This would reduce the surface exposed to the high privileges.

> There has to exist a secure, authenticated and authorized channel of communication between
the NM and the privileged NT service. Possible alternatives are a new TCP endpoint, Java RPC
etc. My proposal though would be to use Windows LPC (Local Procedure Calls), which is a Windows
platform specific inter-process communication channel that satisfies all requirements and
is easy to deploy. The privileged NT service would register and listen on an LPC port (NtCreatePort,
NtListenPort). The NM would use JNI to interop with libwinutils which would host the LPC client
code. The client would connect to the LPC port (NtConnectPort) and send a message requesting
a container launch (NtRequestWaitReplyPort). LPC provides authentication and the privileged
NT service can use authorization API (AuthZ) to validate the caller.

This message was sent by Atlassian JIRA
