hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5531) UnmanagedAM pool manager for federating application across clusters
Date Wed, 10 May 2017 22:30:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005569#comment-16005569 ]

Karthik Kambatla commented on YARN-5531:
----------------------------------------

Thanks for working on this, [~botong]. I took a close look at the new files and skimmed through the rest. Comments:
# Is yarn-server the best place for these? In the future, don't we want other clients to use this UAMPool? If we do move it to a different package, we need to think about the Visibility and Stability annotations.
# UnmanagedAMPoolManager:
## The create methods seem to expect an ApplicationAttemptId from the user. Is that reasonable? Should it be the other way round, where we give the user the ApplicationAttemptId for the newly created app?
## What are the benefits of using maps keyed by user-supplied Strings? Why not just key them by ApplicationAttemptId? The create methods could then return the app attempt.
## Nit: In serviceStart, when creating the maps, there is no need to specify the type arguments since Java 7; the diamond operator suffices (see the diamond sketch after this list).
## In serviceStart and serviceStop, shouldn't we call the equivalent super methods right at the end? Otherwise, the state machine would transition the service to STARTED or STOPPED even if it is not fully in that state (see the lifecycle sketch after this list).
## serviceStop
### I see the code tries to parallelize killing AMs. Is this necessary? How bad is sequential
killing of apps? 
### Nit: The ExecutorCompletionService doesn't need the type arguments in its creation either.
### Why do we need the lock on the uamMap? 
### Nit: Style choice. Where possible, I like to avoid nesting. The isEmpty check is only there for the logging; can we avoid nesting the for loop inside it?
### If we fail to kill the application, is catching the exception enough? Is there merit to retrying? Should we capture this state and throw an exception past this loop (see the parallel-kill sketch after this list)?
## createUAM should be annotated @VisibleForTesting
## Nit: allocateAsync: I don't see the need for the local variable uam.
## finishAM
### Nit: Same here, I don't see the need for the local variable uam.
### Don't we need to handle the case where the app is still registered? Retry?
# UnmanagedApplicationManager
## Should this class be called UnmanagedApplicationMaster?
## Constructor: No need to specify the type arguments when creating the LinkedBlockingQueue; the diamond operator works here too.
## UnmanagedAMLauncher
### It is not clear to me that this needs to be a separate inner class, beyond grouping the methods that create an AM.
### submitAndGetAppId doesn't seem to actually return the app id?
### Why not use YarnClient (see the YarnClient sketch after this list)? I understand this UAM pool is currently in yarn-server, but once we move this out, it should be easier.
### Would it be possible to have a single monitor method?
### Isn't one second too long a wait in monitor* methods?
## UnmanagedAMIdentifier can be private, and so can its methods.
## CallbackHandlerThread
### Can the combination of requestQueue and CallbackHandlerThread be achieved using a dispatcher (see the dispatcher sketch after this list)?
### Should this thread be named HeartbeatHandlerThread or AMRequestHandlerThread? The thread is processing requests.
### We seem to throw RuntimeExceptions. Should these be YarnExceptions instead?
### Since the thread can crash, wouldn't it be nicer to install an UncaughtExceptionHandler on it (sketch after this list)?
## finishApplicationMaster
### Can the two {{if (rmProxy == null)}} checks be merged into one?
### Should the {{rmProxy.finishApplicationMaster}} call be in a loop (see the unregister-retry sketch after this list)? Or is one check and re-register enough?
## allocateAsync
### Is it okay to ignore the InterruptedException?
### The warning on UAM not being launched/registered seems unnecessary.
### Should the {{rmProxy == null && registerRequest == null}} check come first, before we even queue this request?
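
A few sketches to make the above concrete. First, the diamond-operator nits (the serviceStart maps and the constructor's LinkedBlockingQueue); the field names here are made up, not from the patch:

{code:java}
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class DiamondSketch {
  // Pre-Java-7 style: type arguments repeated on the right-hand side.
  Map<String, Object> verbose = new ConcurrentHashMap<String, Object>();

  // Java 7+ diamond operator: the compiler infers the type arguments.
  Map<String, Object> concise = new ConcurrentHashMap<>();
  BlockingQueue<Runnable> requestQueue = new LinkedBlockingQueue<>();
}
{code}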
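On calling the super methods last in serviceStart/serviceStop, this is the ordering I have in mind, assuming the class extends Hadoop's AbstractService (class name is illustrative):

{code:java}
import org.apache.hadoop.service.AbstractService;

public class PoolServiceSketch extends AbstractService {
  public PoolServiceSketch() {
    super(PoolServiceSketch.class.getName());
  }

  @Override
  protected void serviceStart() throws Exception {
    // ... bring up internal state first ...
    // Calling super last means the state machine only moves to STARTED
    // once the service really is fully started.
    super.serviceStart();
  }

  @Override
  protected void serviceStop() throws Exception {
    // ... tear down internal state first ...
    super.serviceStop();
  }
}
{code}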
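On serviceStop capturing kill failures and throwing past the loop, one possible shape; {{killTasks}} stands in for the per-UAM kill actions and is not the patch's actual code:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelKillSketch {
  /** Kill all UAMs in parallel, then surface any failures after the loop. */
  static void killAll(List<Runnable> killTasks) throws Exception {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, killTasks.size()));
    // No type arguments needed on the right-hand side since Java 7.
    CompletionService<Void> completionService =
        new ExecutorCompletionService<>(pool);
    for (Runnable task : killTasks) {
      completionService.submit(task, null);
    }
    List<Exception> failures = new ArrayList<>();
    for (int i = 0; i < killTasks.size(); i++) {
      try {
        completionService.take().get(); // rethrows any kill failure
      } catch (Exception e) {
        failures.add(e); // capture instead of swallowing
      }
    }
    pool.shutdown();
    if (!failures.isEmpty()) {
      // Throw past the loop so the caller learns the stop was incomplete.
      throw new Exception(
          "Failed to kill " + failures.size() + " applications", failures.get(0));
    }
  }
}
{code}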
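On the YarnClient question, a sketch of what submission could look like once the pool moves somewhere yarn-client is available; the wrapper method is hypothetical:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;

public class YarnClientSketch {
  static ApplicationId submit(Configuration conf) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();
    // Ask the RM for a new application and read its id up front.
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext context = app.getApplicationSubmissionContext();
    context.setUnmanagedAM(true); // UAM: YARN launches no AM container
    ApplicationId appId = context.getApplicationId();
    yarnClient.submitApplication(context);
    return appId;
  }
}
{code}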
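On replacing requestQueue plus CallbackHandlerThread with a dispatcher, a rough sketch using YARN's AsyncDispatcher; UamEvent and UamEventType are made-up names:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.event.AbstractEvent;
import org.apache.hadoop.yarn.event.AsyncDispatcher;
import org.apache.hadoop.yarn.event.EventHandler;

public class DispatcherSketch {
  // Hypothetical event type for the heartbeat/allocate requests.
  enum UamEventType { ALLOCATE }

  static class UamEvent extends AbstractEvent<UamEventType> {
    UamEvent() {
      super(UamEventType.ALLOCATE);
    }
  }

  public static void main(String[] args) {
    AsyncDispatcher dispatcher = new AsyncDispatcher();
    // The registered handler replaces the hand-rolled queue + handler thread.
    dispatcher.register(UamEventType.class,
        (EventHandler<UamEvent>) event -> System.out.println("handling " + event));
    dispatcher.init(new Configuration());
    dispatcher.start();
    dispatcher.getEventHandler().handle(new UamEvent());
    dispatcher.stop();
  }
}
{code}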
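On the UncaughtExceptionHandler point, a minimal sketch; a real handler would presumably log and trigger recovery rather than just print:

{code:java}
public class UncaughtHandlerSketch {
  public static void main(String[] args) throws InterruptedException {
    Thread handler = new Thread(() -> {
      // Simulated crash in the request-handling loop.
      throw new RuntimeException("heartbeat failed");
    }, "AMRequestHandlerThread");
    // Without this, the thread dies silently and requests pile up unserved.
    handler.setUncaughtExceptionHandler((t, e) ->
        System.err.println("Thread " + t.getName() + " died: " + e));
    handler.start();
    handler.join();
  }
}
{code}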
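Finally, on looping {{rmProxy.finishApplicationMaster}}: one possible shape, mirroring the retry-until-unregistered loop in AMRMClientImpl; the method wrapper and backoff are illustrative:

{code:java}
import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponse;

public class FinishLoopSketch {
  /** Retry until the RM confirms the unregister. */
  static FinishApplicationMasterResponse finish(ApplicationMasterProtocol rmProxy,
      FinishApplicationMasterRequest request) throws Exception {
    while (true) {
      FinishApplicationMasterResponse response =
          rmProxy.finishApplicationMaster(request);
      if (response.getIsUnregistered()) {
        return response;
      }
      Thread.sleep(100); // back off before asking again
    }
  }
}
{code}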



> UnmanagedAM pool manager for federating application across clusters
> -------------------------------------------------------------------
>
>                 Key: YARN-5531
>                 URL: https://issues.apache.org/jira/browse/YARN-5531
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Subru Krishnan
>            Assignee: Botong Huang
>         Attachments: YARN-5531-YARN-2915.v1.patch, YARN-5531-YARN-2915.v2.patch,
> YARN-5531-YARN-2915.v3.patch, YARN-5531-YARN-2915.v4.patch, YARN-5531-YARN-2915.v5.patch,
> YARN-5531-YARN-2915.v6.patch, YARN-5531-YARN-2915.v7.patch, YARN-5531-YARN-2915.v8.patch,
> YARN-5531-YARN-2915.v9.patch, YARN-5531-YARN-2915.v10.patch
>
>
> One of the main tenets of YARN Federation is to *transparently* scale applications across
> multiple clusters. This is achieved by running UAMs on behalf of the application in other
> clusters. This JIRA tracks the addition of an UnmanagedAM pool manager for federating applications
> across clusters, which will be used by the FederationInterceptor (YARN-3666), itself part of
> the AMRMProxy pipeline introduced in YARN-2884.



