hadoop-mapreduce-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: Securely discovering Application Master's metadata or sending a secret to Application Master at submission
Date Fri, 10 Jun 2016 05:55:36 GMT
Hi

I see you have several questions:

1.       How do I get the AM RPC port?

>>> You can get it via YarnClient#getApplicationReport(). The report contains only the generic
details that YARN tracks for every application. Note that the RM does not maintain any custom
details for applications.

2.       How can I get the metadata of the AM?

>>> The AM should be designed to bind a client-facing interface to its RPC endpoint. The
application submitter obtains the AM's RPC host and port from the ResourceManager, connects to
the AM at host:port, and gets the required details from the AM itself. YARN does not provide
an interface for this because AMs are written by users. It is up to the user to design the AM
so that it exposes a client interface to its clients. For a concrete example, see MRAppMaster
in the MapReduce framework.

3.       About authenticating the job submitter to the AM

>>> Use a secured Hadoop cluster with Kerberos enabled. Note that the AM must also be
implemented to handle Kerberos.


Thanks & Regards
Rohith Sharma K S

From: Mingyu Kim [mailto:mkim@palantir.com]
Sent: 10 June 2016 03:47
To: Rohith Sharma K S; user@hadoop.apache.org
Cc: Matt Cheah
Subject: Re: Securely discovering Application Master's metadata or sending a secret to Application
Master at submission

Hi Rohith,

Thanks for the pointers. I checked the Hadoop documentation you linked, but it’s not clear
how I can expose a client interface for providing metadata. By “YARN internal communications”,
I was referring to the endpoints that the AM exposes on the RPC port reported in ApplicationReport.
I assume either the RM or containers will communicate with the AM through these endpoints.

I believe your suggestion is to expose additional endpoints on the AM RPC port. Can you clarify
how I can do that? Is there an interface/class I need to extend? How can I register the extra
endpoints for providing metadata on the existing AM RPC port?

Mingyu

From: Rohith Sharma K S <rohithsharmaks@huawei.com>
Date: Wednesday, June 8, 2016 at 11:15 PM
To: Mingyu Kim <mkim@palantir.com>, "user@hadoop.apache.org" <user@hadoop.apache.org>
Cc: Matt Cheah <mcheah@palantir.com>
Subject: RE: Securely discovering Application Master's metadata or sending a secret to Application
Master at submission

Hi

Do you know how I can extend the client interface of the RPC port?
>>> YARN provides the YarnClient library, which uses ApplicationClientProtocol. For more
details, see https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html#Writing_a_simple_Client

I know AM has some endpoints exposed through the RPC port for internal YARN communications,
but was not sure how I can extend it to expose a custom endpoint.
>>> I am not sure what you mean by internal YARN communication. The AM connects to the RM
only via the AM-RM interface, for register/unregister and heartbeat, and the details sent to
the RM are limited. It is up to the AM to expose a client interface for providing metadata.
Thanks & Regards
Rohith Sharma K S
From: Mingyu Kim [mailto:mkim@palantir.com]
Sent: 09 June 2016 11:21
To: Rohith Sharma K S; user@hadoop.apache.org
Cc: Matt Cheah
Subject: Re: Securely discovering Application Master's metadata or sending a secret to Application
Master at submission

Hi Rohith,

Thanks for the quick response. That sounds promising. Do you know how I can extend the client
interface of the RPC port? I know AM has some endpoints exposed through the RPC port for internal
YARN communications, but was not sure how I can extend it to expose a custom endpoint. Any
pointer would be appreciated!

Mingyu

From: Rohith Sharma K S <rohithsharmaks@huawei.com>
Date: Wednesday, June 8, 2016 at 10:39 PM
To: Mingyu Kim <mkim@palantir.com>, "user@hadoop.apache.org" <user@hadoop.apache.org>
Cc: Matt Cheah <mcheah@palantir.com>
Subject: RE: Securely discovering Application Master's metadata or sending a secret to Application
Master at submission

Hi

Apart from the AM address and tracking URL, no other metadata about the ApplicationMaster is
stored in YARN. The AM could expose a client interface so that its clients can interact with
the running AM to retrieve AM-specific details.
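
A minimal sketch of the "AM exposes its own client interface" idea, using only the JDK. It swaps
in a plain HTTP endpoint for Hadoop RPC and does no authentication, so it illustrates the shape
of the design rather than a secure implementation. The class name, the /metadata path, and the
"restPort=8443" payload are all hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class AmMetadataEndpoint {
    /** AM side: serve a fixed metadata string at /metadata (port 0 = any free port). */
    static HttpServer start(int port, String metadata) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            byte[] body = metadata.getBytes(StandardCharsets.UTF_8);
            server.createContext("/metadata", exchange -> {
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: connect to the AM at host:port and read the metadata. */
    static String fetch(String host, int port) {
        URI uri = URI.create("http://" + host + ":" + port + "/metadata");
        try (InputStream in = uri.toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(0, "restPort=8443");
        try {
            // In a real deployment, host and port would come from the application report.
            System.out.println(fetch("127.0.0.1", server.getAddress().getPort()));
        } finally {
            server.stop(0);
        }
    }
}
```

In a real application the client would look up the host and port in the application report
rather than hard-coding them, and the endpoint would sit behind whatever authentication the
cluster uses.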

The AM's RPC port can be obtained through the YARN client interface, via ApplicationClientProtocol#getApplicationReport()
or ApplicationClientProtocol#getApplicationAttemptReport().

Thanks & Regards
Rohith Sharma K S

From: Mingyu Kim [mailto:mkim@palantir.com]
Sent: 09 June 2016 10:36
To: user@hadoop.apache.org
Cc: Matt Cheah
Subject: Securely discovering Application Master's metadata or sending a secret to Application
Master at submission

Hi all,

To provide a bit of background, I’m trying to deploy a REST server on the Application Master
and discover its randomly assigned port number securely. I can easily discover the host name
of the AM through the YARN REST API, but the port number needs to be discovered separately. (The
port number is assigned within a specified range, with retries to avoid port conflicts.) An easy
solution would be to have the Application Master make a callback with the port number, but I’d
like to design it such that YARN nodes don’t talk back to the node that submitted the YARN
application. So this problem reduces to securely discovering a small piece of metadata about the
Application Master. To be clear, by “secure” I’m less concerned about exposing the information
to others and more concerned about the integrity of the data (e.g. that the metadata actually
originated from the Application Master).
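
For illustration, the port assignment described above (a specified range, retrying on conflict)
could look roughly like the following. This is a sketch, not the actual server code: the class
name, method name, and the 20000-20010 range are assumptions.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortRangeBinder {
    /**
     * Tries each port in [low, high] in order and returns the first socket that binds.
     * A failed bind (for example, the port is already in use) moves on to the next port.
     */
    static ServerSocket bindInRange(int low, int high) {
        IOException last = null;
        for (int port = low; port <= high; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port unavailable; retry with the next one
            }
        }
        throw new IllegalStateException("No free port in range " + low + "-" + high, last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = bindInRange(20000, 20010)) {
            // This is the value the submitter later needs to discover.
            System.out.println("Bound to port " + socket.getLocalPort());
        }
    }
}
```

The port returned by getLocalPort() is only known once the AM is running, which is why it has
to be discovered after submission.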

I was hoping that there is a way to register some Application Master metadata with the Resource
Manager, but there doesn’t seem to be one. Another option I considered was to write the
information to an HDFS file, but in order to verify the integrity of the content, I need a
way to securely send a private key to the Application Master, and I’m not sure what the best
way to do that is.
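
To make the integrity check concrete, here is a minimal sketch of signing and verifying the
metadata with an RSA key pair from the JDK. It assumes the Application Master already holds the
private key and the submitter holds the public key. How the private key reaches the AM is
exactly the open question and is not addressed here. The class name and the "host:port" payload
are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class MetadataSigner {
    /** Submitter side: generate the key pair before submitting the application. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** AM side: sign the metadata before writing it (and the signature) to the shared file. */
    static byte[] sign(String metadata, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(metadata.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Submitter side: check that the metadata was signed with the matching private key. */
    static boolean verify(String metadata, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(metadata.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] signature = sign("am-host.example.com:31234", keys.getPrivate());
        System.out.println(verify("am-host.example.com:31234", signature, keys.getPublic()));
    }
}
```

Verification fails if either the metadata or the signature is altered after signing, which
covers the integrity concern but not confidentiality.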

To recap, does anyone know if there is a way

•         To register small metadata securely from Application Master to Resource Manager
so that it can be discovered by the YARN application submitter?

•         Or, to securely send a private key to Application Master at the application submission
time?

Thanks a lot,
Mingyu