reef-dev mailing list archives

From "Sergiy Matusevych (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (REEF-1747) Workaround to allow two AMs in one YARN container
Date Tue, 18 Apr 2017 23:57:41 GMT

     [ https://issues.apache.org/jira/browse/REEF-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergiy Matusevych closed REEF-1747.
-----------------------------------
    Resolution: Fixed

It turns out we don't need a workaround. The proper fix is implemented in [REEF-1776].

> Workaround to allow two AMs in one YARN container
> -------------------------------------------------
>
>                 Key: REEF-1747
>                 URL: https://issues.apache.org/jira/browse/REEF-1747
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF Runtime YARN, REEF-Runtime-YARN
>            Reporter: Sergiy Matusevych
>            Assignee: Sergiy Matusevych
>            Priority: Critical
>              Labels: workaround
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The current version of the YARN Java client does not allow us to register two Application
> Masters running in the same process. Technically, the YARN Resource Manager does not care which
> process each AM runs in. However, there is a problem with the YARN Java client implementation:
> the library contains a singleton {{UserGroupInformation}} object that holds the user credentials
> of the current RM session. This data structure is shared by all AMs, and when a REEF application
> tries to register the second (unmanaged) AM, the client library presents _all_ credentials
> to the YARN RM, including the security token of the _first_ (managed) AM. YARN rejects such a
> registration request, throwing {{InvalidApplicationMasterRequestException}}: _"Application Master
> is already registered"._
> A proper fix for this issue would be a patch for the Hadoop YARN Java client that would
> allow us to pass the required security token into the {{AMRMClientAsync.registerApplicationMaster()}}
> call. We also need a quick workaround in REEF so that we can run REEF-on-REEF and Spark+REEF applications
> using unpatched Hadoop libraries.
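
The failure mode described above can be illustrated with a toy model in plain Java. This is only a sketch under stated assumptions, not the Hadoop or REEF code: {{ProcessCredentials}} and {{ToyResourceManager}} are hypothetical stand-ins for the process-wide credentials holder and the RM, the token strings are made up, and the rule that the RM identifies the caller by the first token presented is a simplification chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SharedCredentialsSketch {

  /** Hypothetical process-wide credentials holder (NOT the Hadoop class). */
  static final class ProcessCredentials {
    private static final ProcessCredentials INSTANCE = new ProcessCredentials();
    private final List<String> tokens = new ArrayList<>();

    static ProcessCredentials get() { return INSTANCE; }
    void addToken(String token) { tokens.add(token); }
    List<String> allTokens() { return new ArrayList<>(tokens); }
    void clear() { tokens.clear(); }
  }

  /** Hypothetical RM: treats the first presented token as the caller's identity. */
  static final class ToyResourceManager {
    private final List<String> registered = new ArrayList<>();

    void register(List<String> presentedTokens) {
      String identity = presentedTokens.get(0); // simplifying assumption
      if (registered.contains(identity)) {
        throw new IllegalStateException("Application Master is already registered");
      }
      registered.add(identity);
    }
  }

  /** Returns true if the toy RM accepted the registration. */
  static boolean tryRegister(ToyResourceManager rm, List<String> presentedTokens) {
    try {
      rm.register(presentedTokens);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  /** Both AMs read from the shared singleton, so the second presents the first AM's token. */
  static boolean[] sharedCredentialsScenario() {
    ToyResourceManager rm = new ToyResourceManager();
    ProcessCredentials creds = ProcessCredentials.get();
    creds.clear();

    creds.addToken("am1-token"); // managed AM
    boolean first = tryRegister(rm, creds.allTokens());

    creds.addToken("am2-token"); // unmanaged AM, same process
    boolean second = tryRegister(rm, creds.allTokens());

    return new boolean[] {first, second};
  }

  /** Each AM passes only its own token, as the proposed client patch would allow. */
  static boolean[] explicitTokenScenario() {
    ToyResourceManager rm = new ToyResourceManager();
    boolean first = tryRegister(rm, Collections.singletonList("am1-token"));
    boolean second = tryRegister(rm, Collections.singletonList("am2-token"));
    return new boolean[] {first, second};
  }

  public static void main(String[] args) {
    boolean[] shared = sharedCredentialsScenario();
    boolean[] explicit = explicitTokenScenario();
    System.out.println("shared credentials: first=" + shared[0] + ", second=" + shared[1]);
    System.out.println("explicit tokens:    first=" + explicit[0] + ", second=" + explicit[1]);
  }
}
```

In this model the second registration fails only when both AMs draw from the shared holder. Passing each AM's own token explicitly lets both succeed, which is the behaviour the proposed patch to the registration call is meant to enable.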



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
