hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2198) Remove the need to run NodeManager as privileged account for Windows Secure Container Executor
Date Fri, 10 Oct 2014 12:31:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166718#comment-14166718
] 

Remus Rusanu commented on YARN-2198:
------------------------------------

> libwinutils.c CreateLogonForUser - confusing name, makes me think a new- > CreateLogonTokenForUser?

RR: Fixed

> TestWinUtils - can we add testing specific to security?
RR: Tracked by YARN-2636

> ContainerLaunch launchContainer - nit, why "userName" here, it's user everywhere else
RR fixed

> getLocalWrapperScriptBuilder - why not an override instead of conditional (see below
wrt WindowsContainerExecutor)
>WindowsSecureContainerExecutor - I really think there should be a "WindowsContainerExecutor"
RR: I left both as is, predates the WSCE

> it looks like this is only a 64 bit build now, where it used to be 64 and 32. I assume
this is intentional and ok?
RR: correct. x86 was not possible to build from mvn, and was not required.

> It would be really nice if we could start to separate out some of this new functionality
from winutils, e.g., make the elevated service functionality independent. I see that there
is a jira for doing so down the road, which is good... it looks like the elevated privilages
are just around creating local directories and (obviously) spawning the process. If a stand-alone
service just created and set permissions on those directories, and the java code simply checked
for their existance and then moved on if they were present, I think that a lot of the back-and-forth
of the elevation could be dropped to just one call to create the base directory and a second
to spawn/hand back the output handles. Is that correct?
RR: I actually intentionally avoided that. The LCE does it, and the result is a lot of duplication
between Java code in Default Container Executor and C code in the native container-executor.
With the WSCE I preferred to keep the logic in Java and use native methods just for primitive
operations.

>service.c
> // We're now transfering ownership of the duplicated handles to the caller
> + // If the RPC call fails after this point the handles are leaked inside the NM process
> this is a little alarming. Doesn't the close() call clean this up, regardless of success/
fail?
RR: I added some more comments to clarify that only a process kill or  hardware error can
fail after this point. An atomic transfer is not possible.

> why is this conditional check different from all the others?
RR: fixed

> nit anonimous sp anonymous
RR: fixed

> just a line added, pls revert
RR: fixed

> ElevatedFileSystem:delete()
> it appears that the tests for existance, etc, are run in a non-elevated way, while the
actions are elevated. Is it possible for permissions to be such that the non-elevated tests
do not see files/directories which are present for permission reasons? should those not be
elevated also?
RR: It is not possible under a correct configured deployment. Explicit overwriting  permissions
can deny this, but that will always be possible (eg. deny permission explicitly to LocalSystem).


> streamReaderThread.run - using the readLine() instead of following the simple buffer
copy idiom in ShellCommandExecutor has some efficiency issues, granted it looks to be reading
"memory sized" data so it may be no big deal, but it would be nice to follow the buffer-copy
pattern instead
RR: I forgot to address this. Todo.

> ContainerExecutor comment on comment:
RR: Fixed

> ContainerLaunch public void sanitizeEnv(...)
RR: This predates WSCE, I left it as is

> ContainerLocalizer LOG.info(String.format("nRet: %d", nRet)); - not sure this should
be "info" level
RR: todo, forgot to address it

> getContainerClasspathJarPrivateDir not used in ContainerExecutor.java, we can remove
that.
RR: fixed

> Unnecessary format change only in YarnConfiguration.we can revert
RR: fixed

> Multiple places exceed 80 column limit code convention.
RR: I think I fixed all new Java code

> DefaultContainerExecutor#buildCommandExecutor
RR: Fixed







> Remove the need to run NodeManager as privileged account for Windows Secure Container
Executor
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-2198
>                 URL: https://issues.apache.org/jira/browse/YARN-2198
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>              Labels: security, windows
>         Attachments: .YARN-2198.delta.10.patch, YARN-2198.1.patch, YARN-2198.11.patch,
YARN-2198.2.patch, YARN-2198.3.patch, YARN-2198.delta.4.patch, YARN-2198.delta.5.patch, YARN-2198.delta.6.patch,
YARN-2198.delta.7.patch, YARN-2198.separation.patch, YARN-2198.trunk.10.patch, YARN-2198.trunk.4.patch,
YARN-2198.trunk.5.patch, YARN-2198.trunk.6.patch, YARN-2198.trunk.8.patch, YARN-2198.trunk.9.patch
>
>
> YARN-1972 introduces a Secure Windows Container Executor. However this executor requires
the process launching the container to be LocalSystem or a member of the a local Administrators
group. Since the process in question is the NodeManager, the requirement translates to the
entire NM to run as a privileged account, a very large surface area to review and protect.
> This proposal is to move the privileged operations into a dedicated NT service. The NM
can run as a low privilege account and communicate with the privileged NT service when it
needs to launch a container. This would reduce the surface exposed to the high privileges.

> There has to exist a secure, authenticated and authorized channel of communication between
the NM and the privileged NT service. Possible alternatives are a new TCP endpoint, Java RPC
etc. My proposal though would be to use Windows LPC (Local Procedure Calls), which is a Windows
platform specific inter-process communication channel that satisfies all requirements and
is easy to deploy. The privileged NT service would register and listen on an LPC port (NtCreatePort,
NtListenPort). The NM would use JNI to interop with libwinutils which would host the LPC client
code. The client would connect to the LPC port (NtConnectPort) and send a message requesting
a container launch (NtRequestWaitReplyPort). LPC provides authentication and the privileged
NT service can use authorization API (AuthZ) to validate the caller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message