hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gour Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5610) Initial code for native services REST API
Date Thu, 08 Sep 2016 17:55:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474563#comment-15474563

Gour Saha commented on YARN-5610:

[~jianhe] thank you for reviewing the patch. Addressing the first set of comments/feedback.
Will upload the patch with the code changes.

h4. API Models:
\{artifact, resource, launch_command, number_of_containers\}
in Application seems duplicated with those inside the component. I feel in this scenario,
a default global setting for artifacts, launch_command etc. is not that appropriate, different
components may likely have different requirements. IMHO, we only need the ones in Component,
this makes the interface cleaner and underlying implementation simpler?
The reason these attributes are at the Application level as well is because there will be
simple applications which will not have any components. For these simple app-definitions forcing
app-owners to create components is not going to be very user-friendly.

unique_component_support: what is the primary use-case to have distinct component name ?
This is SLIDER-1100. It is a very powerful feature where app-owners don't need to define multiple
roles of a component with almost everything same, except a few configurations. This attribute
is how they express this feature through the API. More details can be found in the Slider

What is the BaseResource object for? Why does Application, ApplicationStatus, Container, Resource
need to extend this class?
This class is to host common attributes of the resources. To start with we have uri. My guess
is the list will grow.

What does the Artifact#APPLICATION mean ?
In case, of complex and nested applications some components will be by themselves full blown
and independent applications itself. The APPLICATION type artifact refers to such external
application definitions, compared to the simpler artifact types like a docker image.

ApplicationState: What is difference RUNNNG vs STARTED, FINISHED vs STOPPED
STARTED is when Yarn has moved an application from ACCEPTED to RUNNING state, but according
to app-owner the application is not running/useful until IPs gets assigned, DNS entry gets
added and/or the app reaches a stable running state where it can start serving end-user requests.
So according to an app-owner, it is STARTED but not RUNNING yet. So both these states are
helpful since STARTED tells them that it has been deployed by Yarn and RUNNING tells them
that their application is ready to serve requests. I think FINISHED needs to be removed as
STOPPED is good enough. Let me look further why it was introduced.

Application#lifetime: it is String type. Does this mean we have to define a scheme for user
to specify the time in string format? How about just using long type ?
That is because app owners can specify the time with a unit such as 30mins or 10hours or 20days.
The swagger definition defines this. The implementation has been kept simple for now, but
will support the units.

ApplicationStatus#errorMessage, how about call it diagnostics ? sometimes we may also return
non-error messages.
This is a good point. Let me change the definition.

h4. Implementation:
“hadoop-yarn-services-api” should be under hadoop-yarn-slider module as peer to hadoop-yarn-slider-core
I think we need to discuss this further since very likely the REST service will be running
in the RM JVM so it might go into the resourcemanager module. But let’s discuss this further.

why the changes needed in hadoop-project/pom.xml
Seems like the top level dependencies are specified here, hence had to add swagger and its
content related dependencies. Am I wrong, should the top-level dependencies be specified in
some other pom?

We should not use a deprecated getPort() method logger.info("Listening at port = {}", applicationApiServer.getPort());,
jenkins will report error.

couple of things for below code

HADOOP_CONFIG = getHadoopConfigs();

SLIDER_CONFIG = getSliderClientConfiguration();
We cannot load hdfs config, that's for hdfs servers. Any reason you need the hdfs configs?
Agreed. Removed.

Instead of calling these two methods, I think we can just call YarnConfiguration yarnConf
= new YarnConfiguration(). This will automatically load the yarn-site and core-site configs.

Why do we need to explicitly call initHadoopBinding, which is already called the super.init()
Good catch. Removed.

These two catch clauses are identical, and Exception extends Throwable, so we only need catch
Throwable, if that's desired.
Agreed. Removed the Throwable catch block.

This will never return null, because the numberOfContainers is intialized as 1. you might
want to check zero ?
      if (application.getNumberOfContainers() == null) \{
        throw new IllegalArgumentException(ERROR_CONTAINERS_COUNT_INVALID);
Actually, if you pass a JSON for the create/POST call with number_of_containers set to empty
string “” or null the Application object is created with numberOfContainers set to null.
Hence that check is there. Zero is an acceptable number since there might be certain components
which will be defined but no instances will be started in the beginning. After the app is
started these components at some point will likely be flexed up from 0 to some n no of instances
(n > 0).

The lifetime field will never be null, because it is intilized as "unlimited" by default
Again for the same reason as numberOfContainers if in the JSON lifetime been explicitly set
to “” or null this check is required.

IIUC, all these code are not needed, because appOptions is only used for logging, uniqueGlobalPropertyCache
is not used logically, Python is not required any more in yarn-slider
Python code has been removed. appOptions and uniqueGlobalPropertyCache are required as appOptions
is the way application configuration properties are injected into Slider client.

In agent-less world, the status command is probably not required. We need a different mechanism
to determine container status. let's remove this for now

appConfOptTriples.addAll(Arrays.asList(compName, configPrefix.toLowerCase()
+ ".statusCommand", DEFAULT_STATUS_CMD));
Agreed. Removed.

remove the unused parameter globalConf in createAppConfigComponent
I have some TODOs on global config which I wanted to get back to. Let me check and remove
if not required.

remove unused method createAppConfigGlobal
I remember writing some comment as to why this method is required. Let me check and remove
if not required.

remove the commented out code in this class, there are quite a few places
Most of the commented code is related to Slider AM resource setting and global config. I will
revisit this in the next patch and clean up if not required.

Can you please explain what this code tries to accomplish ? it's hard to understand what it
tries to do.
Sorry I should have added some comments there. This is what I added now -
    // If artifact is of type APPLICATION, then in the POST JSON there will
    // be no component definition for that artifact. Hence it's corresponding id
    // field is added. Every external APPLICATION has a unique id field.
    List<String> convertedDeps = new ArrayList<>();

    // If the DNS dependency property is set to true for a component, it means
    // that it is ensured that DNS entry has been added for all the containers
    // of this component, before moving on to the next component in the DAG.
    if (hasPropertyWithValue(component, PROPERTY_DNS_DEPENDENCY, "true")) {

The lifetime is tied to the application according to the API, why does it need to pass down
to every component? IIUC, the flow should be, once the app timeouts, all components for this
app timeout.
This is a dirty way how lifetime is handled in Slider. It will be thrown away once YARN-3813
is used to set and manage lifetime.

> Initial code for native services REST API
> -----------------------------------------
>                 Key: YARN-5610
>                 URL: https://issues.apache.org/jira/browse/YARN-5610
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gour Saha
>            Assignee: Gour Saha
>         Attachments: YARN-4793-yarn-native-services.001.patch
> This task will be used to submit and review patches for the initial code drop for the
native services REST API 

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

View raw message