hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3539) Compatibility doc to state that ATS v1 is a stable REST API
Date Tue, 28 Apr 2015 21:54:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518163#comment-14518163 ]

Zhijie Shen commented on YARN-3539:
-----------------------------------

Steve, thanks for consolidating the patch. Here're some of my comments and thoughts.

bq. What is essential is that all the existing operations must not change, so that shipping
applications do not break.

Yeah, we can retain the v1 APIs (this is actually what we're doing now), but the problem is
around "do not break". Does it mean ATS v2 should be compatible with the v1 APIs? In other words,
do we support a user's old app using a v1 client to talk to a v2 server?

bq. it is critical to declare that ATSv1 is stable. Without that guarantee, it is impossible
for any application to commit to using the APIs. 
bq. Spark depends on this for the SPARK-1537 feature, some ongoing work with Accumulo depends
on this, when Slider adds ATS support we'll depend on this stability guarantee, etc, etc.

I pretty much understand the desirability of stable APIs. However, I can see that TEZ and Hive/Pig
on TEZ started integrating the service even without our declaring the APIs stable. Though the
APIs were not declared stable, that didn't mean we kept changing them from release to release.
Instead, the reality is that the timeline API has been almost compatible since 2.4. Marking it as
unstable before was more like reserving the right to change it to improve the service. So
I'm not sure if now is good timing, as we foresee that in the near future we're going to
upgrade to ATS v2, which may significantly rework the APIs.

bq. One area that is not covered in the ATSv1 API is what constitutes a valid entity type
or domain.

Do you mean the mandatory fields? For an entity, they're type, id and starttime (which can be
optional if the entity contains at least one event). For an event, they are type and timestamp.
For a domain, it's just id.
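
To make the mandatory-field rules above concrete, here is a minimal sketch of a client-side check. It is not the timeline server's actual validation; the function names and the dict keys (`entitytype`, `entity`, `starttime`, `events`, `eventtype`, `timestamp`, `id`) are assumptions chosen for illustration.

```python
def missing_event_fields(event):
    """Return the mandatory event fields (type, timestamp) that are absent."""
    return [k for k in ("eventtype", "timestamp") if event.get(k) is None]


def missing_domain_fields(domain):
    """Return the mandatory domain fields (only id) that are absent."""
    return [k for k in ("id",) if not domain.get(k)]


def missing_entity_fields(entity):
    """Return the mandatory entity fields that are absent.

    starttime may be omitted when the entity carries at least one event.
    """
    missing = [k for k in ("entitytype", "entity") if not entity.get(k)]
    events = entity.get("events") or []
    if entity.get("starttime") is None and not events:
        missing.append("starttime")
    return missing
```

An entity with a type and an id but neither a starttime nor any event would be reported as missing `starttime`; adding a single event makes the same entity pass.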

bq. There is also the fact that the /domain path was added under /ws/v1/timeline/, so matches
the path of entity types. Can you have an entity type called "domain"? Was it previously possible?

We cannot. "timeline/domain" has blocked the entity type "domain" since the domain feature was
added. I think we should state it in the documentation (perhaps we want to reserve more names for
future use). Other than this, I think we shouldn't impose any other restriction on naming the
identifier.
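
The path collision can be sketched as follows. This is only an illustration of why the name is blocked, not the server's routing code; `RESERVED_NAMES`, `resolve`, and the sample entity type `MY_FRAMEWORK_TASK` are hypothetical, and whether the real match is case sensitive is not something this sketch settles.

```python
TIMELINE_ROOT = "/ws/v1/timeline/"

# Hypothetical reserved set; today it would hold only "domain".
RESERVED_NAMES = frozenset({"domain"})


def resolve(path):
    """Classify a path under the timeline root as a domain or entity-type request."""
    if not path.startswith(TIMELINE_ROOT):
        raise ValueError("not a timeline path: " + path)
    # The first path segment after the root is either a reserved name
    # or is interpreted as an entity type.
    first_segment = path[len(TIMELINE_ROOT):].split("/", 1)[0]
    if first_segment in RESERVED_NAMES:
        return ("reserved", first_segment)
    return ("entity_type", first_segment)
```

With this shape, `resolve("/ws/v1/timeline/domain")` never reaches the entity-type branch, which is the behaviour the documentation would need to call out.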

bq. strictly defining what constitutes a valid entity type via a regular expression, and declaring
whether the types are case sensitive.

This is a good idea. We can define the char set and the pattern to prevent users from defining
arbitrary names, but I'm not sure if it is easy to put into practice. The question is whether
we're going to break existing users who have already defined names that won't match
our future regex.
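
One way to gauge that risk before committing to a pattern is to run a candidate regex over the entity types already in use. The sketch below assumes a hypothetical pattern and hypothetical sample names; neither is a proposal for the actual rule.

```python
import re

# Hypothetical candidate: a letter followed by letters, digits, '_', '.' or '-'.
CANDIDATE_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.-]*")


def names_that_would_break(existing_names, pattern=CANDIDATE_PATTERN):
    """Return the existing entity type names the candidate pattern would reject."""
    return [name for name in existing_names if pattern.fullmatch(name) is None]
```

An empty result for the names collected from current users would suggest the pattern can be adopted without breaking them; any name in the result is an existing user the regex would lock out.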


Some comments about the patch:

1. For the bullet points of "Current Status and Future Plans", can we organize them a bit
better? For example, we could partition them into two groups: a) current status and b) future
plans. For bullet 4, it's not just history, but all timeline data.

2. Can we move "Timeline Server REST API" section before "Generic Data REST APIs"?

3. The application elements table seems to be wrongly formatted. I think that's why the site
compilation failed.

4. "Generic Data REST APIs" output examples need to be slightly updated. Some more fields
are added or changed.

5. "Timeline Server REST API" output examples are not genuine. Perhaps, we can run a simple
MR example job, and get the up-to-date timeline entity and application info to show as the
examples.

One additional thing that is not covered by the documentation is entity uniqueness. In
v1, an entity is globally identified by <type, id>. It means that if user1 has posted <type1,
id1> in his application, user2 cannot post an entity with the same identifier in his application,
even if the two are completely unrelated. Therefore, users are advised to come up with a unique
entity type for their framework to avoid namespace collisions.
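
The collision can be illustrated with a toy in-memory model keyed by <type, id>. It only mirrors the uniqueness rule described above and says nothing about how the timeline store is implemented; `TinyTimelineStore` and the framework-prefixed type name are hypothetical.

```python
class TinyTimelineStore:
    """Toy model: one global namespace keyed by (entity_type, entity_id)."""

    def __init__(self):
        self._owners = {}  # (entity_type, entity_id) -> user who first posted it

    def put(self, user, entity_type, entity_id):
        """Return True if the post is accepted, False if another user owns the key."""
        key = (entity_type, entity_id)
        # The first poster claims the key; later posts succeed only for that user.
        owner = self._owners.setdefault(key, user)
        return owner == user


store = TinyTimelineStore()
first = store.put("user1", "type1", "id1")                  # user1 claims <type1, id1>
clash = store.put("user2", "type1", "id1")                  # same key, different user
prefixed = store.put("user2", "FRAMEWORK_B_type1", "id1")   # framework-specific type
```

Here `clash` is rejected while `prefixed` is accepted, which is why a framework-specific entity type sidesteps the problem.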



> Compatibility doc to state that ATS v1 is a stable REST API
> -----------------------------------------------------------
>
>                 Key: YARN-3539
>                 URL: https://issues.apache.org/jira/browse/YARN-3539
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 2.7.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch, YARN-3539-003.patch,
YARN-3539-004.patch
>
>
> The ATS v2 discussion and YARN-2423 have raised the question: "how stable are the ATSv1
APIs"?
> The existing compatibility document actually states that the History Server is [a stable
REST API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs],
which effectively means that ATSv1 has already been declared as a stable API.
> Clarify this by patching the compatibility document appropriately



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
