Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183])
    by cust-asf2.ponee.io (Postfix) with ESMTP id DBDA0200C5E
    for ; Sat, 18 Mar 2017 00:15:51 +0100 (CET)
Received: by cust-asf.ponee.io (Postfix) id DA768160B80;
    Fri, 17 Mar 2017 23:15:51 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by cust-asf.ponee.io (Postfix) with SMTP id 0AA84160B8C
    for ; Sat, 18 Mar 2017 00:15:50 +0100 (CET)
Received: (qmail 20641 invoked by uid 500); 17 Mar 2017 23:15:50 -0000
Mailing-List: contact yarn-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list yarn-dev@hadoop.apache.org
Received: (qmail 20517 invoked by uid 99); 17 Mar 2017 23:15:50 -0000
Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 17 Mar 2017 23:15:50 +0000
Received: from mail-vk0-f49.google.com (mail-vk0-f49.google.com [209.85.213.49])
    by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id AF2EE1A0976
    for ; Fri, 17 Mar 2017 23:15:49 +0000 (UTC)
Received: by mail-vk0-f49.google.com with SMTP id x75so48584301vke.2
    for ; Fri, 17 Mar 2017 16:15:49 -0700 (PDT)
X-Gm-Message-State: AFeK/H2DboAkLgR7S5SlSZB0i4QGj2tqG1/OkvaFTknrv6Tqs+KaB9QTg/xoGbhgnwFCpkkIkISrv9/M/eRYUg==
X-Received: by 10.31.165.148 with SMTP id o142mr1588967vke.85.1489792548580;
    Fri, 17 Mar 2017 16:15:48 -0700 (PDT)
MIME-Version: 1.0
Reply-To: subru@apache.org
Received: by 10.103.149.144 with HTTP; Fri, 17 Mar 2017 16:15:47 -0700 (PDT)
In-Reply-To:
References: <2023140125.1703150.1489700071731.ref@mail.yahoo.com> <2023140125.1703150.1489700071731@mail.yahoo.com>
From: Subru Krishnan
Date: Fri, 17 Mar 2017 16:15:47 -0700
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: Two AMs in one YARN container?
To: Arun Suresh
Cc: dev@reef.apache.org, Jason Lowe, Sergiy Matusevych, "yarn-dev@hadoop.apache.org", Chris Douglas, Markus Weimer, Daryn Sharp, Botong Huang
Content-Type: multipart/alternative; boundary=001a1142e4b0a5ba89054af55bc8
archived-at: Fri, 17 Mar 2017 23:15:52 -0000

--001a1142e4b0a5ba89054af55bc8
Content-Type: text/plain; charset=UTF-8

Thanks Arun for the heads-up.

Hi Sergiy,

We do run a UAM pool under one process (AMRMProxyService in the NM), as that's
the mechanism we use to span a single job across multiple clusters that are
under federation. This is achieved by using the doAs method in
UserGroupInformation, exactly as Jason pointed out.

The e2e *prototype* code (and docs/slides) is available in the Federation
umbrella JIRA:
https://issues.apache.org/jira/browse/YARN-2915

I have created a utility class that's used throughout YARN Federation to
create RMProxies per UGI - FederationProxyProviderUtil (as part of
YARN-3673), which should provide a good starting point for you.

You should also keep an eye on the UAM pool JIRA, which Botong is working on
right now:
https://issues.apache.org/jira/browse/YARN-5531

-Subru

On Thu, Mar 16, 2017 at 2:49 PM, Arun Suresh wrote:

> Hey Sergiy,
>
> I think a similar approach IIUC, where an AM for an app running on a
> cluster acts as an unmanaged AM on another cluster. I believe they use a
> separate UGI for each sub-cluster and wrap it around a doAs before the
> actual allocate call.
>
> Subru might be able to give more details.
>
> Cheers
> -Arun
>
> On Thu, Mar 16, 2017 at 2:34 PM, Jason Lowe
> wrote:
>
>> The doAs method in UserGroupInformation is what you want when dealing
>> with multiple UGIs. It determines which UGI instance the code within the
>> doAs scope gets when that code tries to look up the current user.
>> Each AM is designed to run in a separate JVM, so each has some
>> main()-like entry point that does everything to set up the AM.
>> Theoretically all you need to do is create two separate UGIs, then use each
>> instance to perform a doAs wrapping the invocation of the corresponding
>> AM's entry point. After that, everything that AM does will get the UGI of
>> the doAs invocation as the current user. Since the AMs are running in
>> separate doAs instances, they will get separate UGIs for the current user
>> and thus separate credentials.
>> Jason
>>
>>
>> On Thursday, March 16, 2017 4:03 PM, Sergiy Matusevych <
>> sergiy.matusevych@gmail.com> wrote:
>>
>>
>> Hi Jason,
>>
>> Thanks a lot for your help again! Having two separate
>> UserGroupInformation instances is exactly what I had in mind. What I do not
>> understand, though, is how to make sure that our second call to
>> .registerApplicationMaster() will pick the right UserGroupInformation
>> object. I would love to find a way that does not involve any changes to the
>> YARN client, but if we have to patch it, of course, I agree that we need to
>> have a generic yet minimally invasive solution.
>> Thank you!
>> Sergiy.
>>
>>
>> On Thu, Mar 16, 2017 at 8:03 AM, Jason Lowe wrote:
>> >
>> > I believe a cleaner way to solve this problem is to create two
>> > _separate_ UserGroupInformation objects and wrap each AM instance in a
>> > UGI doAs so they aren't trying to share the same credentials. This is
>> > one example of a token bleeding over and causing problems. I suspect
>> > trying to fix these one-by-one as they pop up is going to be frustrating
>> > compared to just ensuring the credentials remain separate, as if they
>> > really were running in separate JVMs.
>> >
>> > Adding Daryn, who knows a lot more about the UGI stuff, so he can correct
>> > any misunderstandings on my part.
>> >
>> > Jason
>> >
>> >
>> > On Wednesday, March 15, 2017 1:11 AM, Sergiy Matusevych <
>> > sergiy.matusevych@gmail.com> wrote:
>> >
>> >
>> > Hi YARN developers,
>> >
>> > I have an interesting problem that I think is related to the YARN Java
>> > client. I am trying to launch *two* application masters in one
>> > container. To be more specific, I am starting a Spark job on YARN, and
>> > launching an Apache REEF Unmanaged AM from the Spark Driver.
>> >
>> > Technically, the YARN Resource Manager should not care which process
>> > each AM runs in. However, there is a problem with the YARN Java client
>> > implementation: there is a global UserGroupInformation object that holds
>> > the user credentials of the current RM session. This data structure is
>> > shared by all AMs, and when the REEF application tries to register the
>> > second (unmanaged) AM, the client library presents to the YARN RM all
>> > credentials, including the security token of the first (managed) AM.
>> > YARN rejects such a registration request, throwing
>> > InvalidApplicationMasterRequestException
>> > "Application Master is already registered".
>> >
>> > I feel like this issue can be resolved by a relatively small update to
>> > the YARN Java client - e.g. by introducing a new variant of
>> > AMRMClientAsync.registerApplicationMaster() that would take the required
>> > security token (instead of getting it implicitly from
>> > UserGroupInformation.getCurrentUser().getCredentials() etc.), or having
>> > some sort of RM session class that would wrap all data that is currently
>> > global. I need to think about an elegant API for it.
>> >
>> > What do you guys think? I would love to work on this problem and send
>> > you a pull request for the upcoming 2.9 release.
>> >
>> > Cheers,
>> > Sergiy.
>> >
>> >
>>
>>
>>
>
>

--001a1142e4b0a5ba89054af55bc8--
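
The doAs scoping discussed in this thread can be illustrated with a small,
self-contained Java sketch. It does not use Hadoop: ScopedUserSketch, User,
Action, LOGIN_USER and the token strings are all hypothetical stand-ins, and
the ThreadLocal is a simplifying assumption (the real UserGroupInformation is
built on JAAS and may behave differently across threads). It only shows why
code that looks up the current user implicitly sees a different identity
inside each doAs scope.

```java
// Hypothetical model of doAs-style scoping; NOT Hadoop's UserGroupInformation.
final class ScopedUserSketch {

    /** Minimal callback type so the sketch needs no imports. */
    interface Action<T> {
        T run();
    }

    /** Illustrative identity: a name plus the single token it may present. */
    static final class User {
        final String name;
        final String token;

        User(String name, String token) {
            this.name = name;
            this.token = token;
        }

        /** Runs the action with this user as current, then restores the previous user. */
        <T> T doAs(Action<T> action) {
            User previous = CURRENT.get();
            CURRENT.set(this);
            try {
                return action.run();
            } finally {
                CURRENT.set(previous);
            }
        }
    }

    // Assumption: plays the role of the process-wide user seen outside any doAs scope.
    static final User LOGIN_USER = new User("login", "token-of-first-AM");

    private static final ThreadLocal<User> CURRENT = new ThreadLocal<>();

    /** Returns the user of the innermost enclosing doAs, or the login user if none. */
    static User currentUser() {
        User user = CURRENT.get();
        return user != null ? user : LOGIN_USER;
    }

    /** Stand-in for an AM entry point that fetches credentials implicitly. */
    static String registerApplicationMaster() {
        User user = currentUser();
        return user.name + " registers with " + user.token;
    }

    public static void main(String[] args) {
        User managed = new User("managed-am", "token-A");
        User unmanaged = new User("unmanaged-am", "token-B");

        // Each AM entry point runs inside its own scope and sees only its own token.
        System.out.println(managed.doAs(ScopedUserSketch::registerApplicationMaster));
        System.out.println(unmanaged.doAs(ScopedUserSketch::registerApplicationMaster));

        // Outside any scope, the implicit lookup falls back to the shared login user.
        System.out.println(registerApplicationMaster());
    }
}
```

With the real API, the equivalent would presumably be two separate
UserGroupInformation instances, each wrapping its AM's entry point in doAs, as
Jason suggests above; no change to registerApplicationMaster() itself is
needed in this model because the lookup stays implicit.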