hadoop-yarn-dev mailing list archives

From Sergiy Matusevych <sergiy.matusev...@gmail.com>
Subject Re: Two AMs in one YARN container?
Date Sat, 18 Mar 2017 00:55:37 GMT
On Fri, Mar 17, 2017 at 4:15 PM, Subru Krishnan <subru@apache.org> wrote:


> Thanks Arun for the heads-up.
>
> Hi Sergiy,
>
> We do run a UAM pool under one process (AMRMProxyService in NM) as that's
> the mechanism we use to span a single job across multiple clusters that are
> under federation. This is achieved by using the doAs method in
> UserGroupInformation, exactly as Jason pointed out.
>
> The e2e *prototype* code (and docs/slides) is available in the Federation
> umbrella jira:
> https://issues.apache.org/jira/browse/YARN-2915
>
> I have created a utility class that's used throughout YARN Federation to
> create RMProxies per UGI - FederationProxyProviderUtil
> <https://github.com/apache/hadoop/blob/YARN-2915/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationProxyProviderUtil.java>
> (as part of YARN-3673 <https://issues.apache.org/jira/browse/YARN-3673>),
> which should provide a good starting point for you.
>
> You should also keep an eye on the UAM pool JIRA, which Botong is working
> on right now:
> https://issues.apache.org/jira/browse/YARN-5531



Hi YARN devs,

*Huge* thanks for your help! If I understand you correctly, that means I do
not need any changes to the YARN client API to run multiple AMs in one
process - excellent news!

I will study the federation code and try that technique in REEF. I'll let
you know how it goes.
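
To make the doAs idea from this thread concrete, here is a minimal sketch
written as a toy model in plain Java, so that it runs without Hadoop on the
classpath. ToyUgi and its ThreadLocal are hypothetical stand-ins, not the
Hadoop implementation: the real UserGroupInformation is built on the JAAS
Subject machinery and its doAs takes a PrivilegedExceptionAction. The method
names only mirror the real getCurrentUser() and doAs(), and
registerApplicationMaster() here is a placeholder for an AM entry point that
reads its credentials implicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class DoAsSketch {

    /** Hypothetical stand-in for UserGroupInformation; NOT the Hadoop class. */
    static final class ToyUgi {
        // Fallback identity, playing the role of the process-wide login user.
        private static final ToyUgi LOGIN_USER = new ToyUgi("login-user");
        // Identity installed by the innermost doAs on this thread, if any.
        private static final ThreadLocal<ToyUgi> CURRENT = new ThreadLocal<>();

        private final String name;
        private final List<String> tokens = new ArrayList<>();

        ToyUgi(String name) { this.name = name; }

        String getName() { return name; }
        void addToken(String token) { tokens.add(token); }
        List<String> getTokens() { return new ArrayList<>(tokens); }

        /** Returns the identity of the enclosing doAs, else the login user. */
        static ToyUgi getCurrentUser() {
            ToyUgi scoped = CURRENT.get();
            return scoped != null ? scoped : LOGIN_USER;
        }

        /** Runs the action with this identity as the "current user". */
        <T> T doAs(Supplier<T> action) {
            ToyUgi previous = CURRENT.get();
            CURRENT.set(this);
            try {
                return action.get();
            } finally {
                // Restore whatever identity was active before this scope.
                if (previous == null) {
                    CURRENT.remove();
                } else {
                    CURRENT.set(previous);
                }
            }
        }
    }

    /** Placeholder AM entry point: looks up its credentials implicitly. */
    static String registerApplicationMaster() {
        ToyUgi ugi = ToyUgi.getCurrentUser();
        return ugi.getName() + " presents " + ugi.getTokens();
    }

    public static void main(String[] args) {
        ToyUgi managed = new ToyUgi("managed-am");
        managed.addToken("AMRMToken-1");
        ToyUgi unmanaged = new ToyUgi("unmanaged-am");
        unmanaged.addToken("AMRMToken-2");

        // Each entry point sees only the tokens of the UGI it runs under.
        System.out.println(managed.doAs(DoAsSketch::registerApplicationMaster));
        System.out.println(unmanaged.doAs(DoAsSketch::registerApplicationMaster));
    }
}
```

The point of the sketch is only the scoping: the entry point is unchanged, and
the identity it picks up is decided by whichever doAs wraps the call.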

Again, thanks a lot Subru, Arun, and Jason -- you guys are awesome :)

Cheers,
Sergiy.



> On Thu, Mar 16, 2017 at 2:49 PM, Arun Suresh <arun.suresh@gmail.com>
> wrote:
>
> > Hey Sergiy,
> >
> > IIUC, YARN Federation takes a similar approach, where an AM for an app
> > running on one cluster acts as an unmanaged AM on another cluster. I
> > believe they use a separate UGI for each sub-cluster and wrap the actual
> > allocate call in a doAs.
> >
> > Subru might be able to give more details.
> >
> > Cheers
> > -Arun
> >
> > On Thu, Mar 16, 2017 at 2:34 PM, Jason Lowe <jlowe@yahoo-inc.com.invalid>
> > wrote:
> >
> >> The doAs method in UserGroupInformation is what you want when dealing
> >> with multiple UGIs.  It determines what UGI instance the code within the
> >> doAs scope gets when that code tries to look up the current user.
> >> Each AM is designed to run in a separate JVM, so each has some
> >> main()-like entry point that does everything to set up the AM.
> >> Theoretically all you need to do is create two separate UGIs, then use
> >> each instance to perform a doAs wrapping the invocation of the
> >> corresponding AM's entry point.  After that, everything that AM does will
> >> get the UGI of the doAs invocation as the current user.  Since the AMs are
> >> running in separate doAs instances they will get separate UGIs for the
> >> current user and thus separate credentials.
> >> Jason
> >>
> >>
> >>     On Thursday, March 16, 2017 4:03 PM, Sergiy Matusevych <
> >> sergiy.matusevych@gmail.com> wrote:
> >>
> >>
> >>  Hi Jason,
> >>
> >> Thanks a lot for your help again! Having two separate
> >> UserGroupInformation instances is exactly what I had in mind. What I do
> >> not understand, though, is how to make sure that our second call to
> >> .registerApplicationMaster() will pick the right UserGroupInformation
> >> object. I would love to find a way that does not involve any changes to
> >> the YARN client, but if we have to patch it, of course, I agree that we
> >> need to have a generic yet minimally invasive solution.
> >> Thank you!
> >> Sergiy.
> >>
> >>
> >> On Thu, Mar 16, 2017 at 8:03 AM, Jason Lowe <jlowe@yahoo-inc.com> wrote:
> >> >
> >> > I believe a cleaner way to solve this problem is to create two
> >> > _separate_ UserGroupInformation objects and wrap each AM instance in a
> >> > UGI doAs so they aren't trying to share the same credentials.  This is
> >> > one example of a token bleeding over and causing problems. I suspect
> >> > trying to fix these one-by-one as they pop up is going to be frustrating
> >> > compared to just ensuring the credentials remain separate as if they
> >> > really were running in separate JVMs.
> >> >
> >> > Adding Daryn who knows a lot more about the UGI stuff so he can
> >> > correct any misunderstandings on my part.
> >> >
> >> > Jason
> >> >
> >> >
> >> > On Wednesday, March 15, 2017 1:11 AM, Sergiy Matusevych
> >> > <sergiy.matusevych@gmail.com> wrote:
> >> >
> >> >
> >> > Hi YARN developers,
> >> >
> >> > I have an interesting problem that I think is related to the YARN Java
> >> > client. I am trying to launch *two* application masters in one
> >> > container. To be more specific, I am starting a Spark job on YARN, and
> >> > launching an Apache REEF Unmanaged AM from the Spark Driver.
> >> >
> >> > Technically, the YARN Resource Manager should not care which process
> >> > each AM runs in. However, there is a problem with the YARN Java client
> >> > implementation: there is a global UserGroupInformation object that
> >> > holds the user credentials of the current RM session. This data
> >> > structure is shared by all AMs, and when the REEF application tries to
> >> > register the second (unmanaged) AM, the client library presents to the
> >> > YARN RM all credentials, including the security token of the first
> >> > (managed) AM. YARN rejects such a registration request, throwing
> >> > InvalidApplicationMasterRequestException "Application Master is already
> >> > registered".
> >> >
> >> > I feel like this issue can be resolved by a relatively small update to
> >> > the YARN Java client - e.g. by introducing a new variant of
> >> > AMRMClientAsync.registerApplicationMaster() that would take the
> >> > required security token (instead of getting it implicitly from
> >> > UserGroupInformation.getCurrentUser().getCredentials() etc.), or
> >> > having some sort of RM session class that would wrap all data that is
> >> > currently global. I need to think about an elegant API for it.
> >> >
> >> > What do you guys think? I would love to work on this problem and send
> >> > you a pull request for the upcoming 2.9 release.
> >> >
> >> > Cheers,
> >> > Sergiy.
> >> >
> >> >
> >>
> >>
> >>
> >>
> >
> >
>
