hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2198) Remove the need to run NodeManager as privileged account for Windows Secure Container Executor
Date Mon, 01 Sep 2014 07:58:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117143#comment-14117143

Remus Rusanu commented on YARN-2198:

1. nativeio.c: Should we return null here?
RR: Fixed

2.Nit: nativeio code uses different naming convention for local variables. Please try to be
consistent with the rest of the file.
RR: Fixed

3. nativeio.c: Nit: I would move throw_ioe if check before done:, the code flow will be less
error prone 
RR: fixed

4. winutils_process_stub.c: Can {{env->NewGlobalRef())) return null/throw? Should we handle
RR: Fixed

5. winutils_process_stub.c: You should properly handle the GetExitCodeProcess() failure case.
RR: fixed

6. winutils_process_stub.c:Init to INVALID_HANDLE_VALUE?
RR: Fixed

7. client.c: Are RPC_STATUS error codes compatible with winerror codes? (semantic around checking
for error)
RR: From my experiments they are compatible. FormatMessage gets the right message for RPC

8. config.cpp: Wondering if there is a way to get to config files without adding a dependency
on env variables?
RR: config location is now ../etc/hadoop/wsce-site.xml relative to exe. It is defined in pom.xml

9. config.cpp: This error check is unintuitive. Can you please be more explicit?
RR: fixed (no longer applies because only one file is checked)

10. config.cpp: Are SAL annotations correct? For strings one would usually use __out_ecount()?
RR: Fixed, and it was  broken all over, thanks for catching it

11. config.cpp: SAL annotation __out_bcount? Also outLen->len in the annotation.
RR: fixed

11. config.cpp: This should be before StringCbPrintf to guarantee that CoInit and CoUninit
are balanced. 
RR: fixed

12. hdpwinutilsvc.idl: Name does not seem appropriate for apache... possibly name it just
winutilsvc.idl. Should we use spaces in this file for consistency? 
RR: fixes all names as "hadoopwinutilsvc"

13. winutils.h:__in_bcount(len) -> __in_ecount(len)
RR: fixed

14. libwinutils.c: I'm wondering if this is good opportunity to introduce unittests for our
C code, as the complexity started increasing beyond just windows OS calls, where there is
little value in unittesting. 
RR: Not fixed. I will come back later and add units here, but the core work (LRPC, SCM, logon
user and create process) are basically untestable from C unit test.

15. libwinutils.c: Should we deallocate this when BuildSecurityDescriptor fails?
RR: is alloca, so it doesn't need dealloc. 

I don't think it is required to do this now, just wanted to bring it up: if our native codebase
continues to grow at this pace we should consider introducing smart pointers. It is becoming
impossibly hard to properly manage the memory in all success/failure cases. This becomes more
important now that we have long running NM native client and winutils service. 
RR: the whole winutils/libwinutils code style is early 90's Petzold Windows code style. I'm
not a fan of it, but I kept all new code consistent with this style. Moving to C++ RAI would
be better, but I don;t want to do it piecemeal. Some other time.

16. What is the behaviour of calling winutils service. Will this command install and start
a winutils.exe service under SYSTEM account, and exit?
RR: no. SCM instalation/config is left to SCM tools (eg. sc.exe). "winutils service" is the
command line to start the service (it starts, register entry point with SCM, waits for SCM

> Remove the need to run NodeManager as privileged account for Windows Secure Container
> ----------------------------------------------------------------------------------------------
>                 Key: YARN-2198
>                 URL: https://issues.apache.org/jira/browse/YARN-2198
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>              Labels: security, windows
>         Attachments: YARN-2198.1.patch, YARN-2198.2.patch, YARN-2198.3.patch, YARN-2198.separation.patch
> YARN-1972 introduces a Secure Windows Container Executor. However this executor requires
a the process launching the container to be LocalSystem or a member of the a local Administrators
group. Since the process in question is the NodeManager, the requirement translates to the
entire NM to run as a privileged account, a very large surface area to review and protect.
> This proposal is to move the privileged operations into a dedicated NT service. The NM
can run as a low privilege account and communicate with the privileged NT service when it
needs to launch a container. This would reduce the surface exposed to the high privileges.

> There has to exist a secure, authenticated and authorized channel of communication between
the NM and the privileged NT service. Possible alternatives are a new TCP endpoint, Java RPC
etc. My proposal though would be to use Windows LPC (Local Procedure Calls), which is a Windows
platform specific inter-process communication channel that satisfies all requirements and
is easy to deploy. The privileged NT service would register and listen on an LPC port (NtCreatePort,
NtListenPort). The NM would use JNI to interop with libwinutils which would host the LPC client
code. The client would connect to the LPC port (NtConnectPort) and send a message requesting
a container launch (NtRequestWaitReplyPort). LPC provides authentication and the privileged
NT service can use authorization API (AuthZ) to validate the caller.

This message was sent by Atlassian JIRA

View raw message