hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
Date Fri, 19 Jun 2015 16:55:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593627#comment-14593627 ]

Zhijie Shen commented on YARN-3051:

First of all, I'd like to say this is not the finalized reader API, but one we are okay
to start with: two types of query and a set of essential parameters, which focus on tuning
what entities to return. We can definitely iterate on the APIs to add more parameters to
trim the results and to control sub-entity information.

bq. We had decided that the user may not need to retrieve all the configs and metrics, and hence
we should have a parameter to indicate that: a list of metrics and confs the user wants to retrieve,
for both the APIs. I had included this in the patch I had made. Do we need it?

Yeah, we could have these parameters, but I'm wondering about an efficient way to retrieve part
of the configs/metrics from a huge set. For example, say I'm interested in all the mapred configs
of my job. What should I do? Enumerating all the mapred configs I want to retrieve in the query
parameter is a nightmare. My immediate thought is regex, but I don't want to include
this parameter in the original version until we're clear about how to specify it.
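
To make the regex idea concrete, here is a minimal sketch of selecting configs by key pattern. It is not part of the patch: the class name ConfigSelector, the method select and the full-match semantics are all assumptions made for illustration, and how the pattern would be passed in the query is still open.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper, not an existing ATS class. */
final class ConfigSelector {
    private ConfigSelector() {}

    /**
     * Returns only the entries whose key fully matches keyRegex,
     * preserving the iteration order of the input map.
     */
    static Map<String, String> select(Map<String, String> configs, String keyRegex) {
        Pattern pattern = Pattern.compile(keyRegex);
        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : configs.entrySet()) {
            if (pattern.matcher(entry.getKey()).matches()) {
                selected.put(entry.getKey(), entry.getValue());
            }
        }
        return selected;
    }
}
```

With something like this, a single pattern such as "mapreduce\\..*" would stand in for the whole enumeration of keys. Filtering on the reader side like this still reads every config from storage, so a real implementation would probably want to push the pattern down to the backend.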

bq. Shouldn't we have metrics filters to support queries like fetch entities which have a
metric > a certain value. In the patch I had included support for relational operators.

We should. See my TODO comment. The problem again is that it's not a simple predicate. How
do we want to abstract and support it? You give the example ">", but we need to take care
of "<", "=", "!=", "like" and so on.
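
One possible way to abstract it, only as a sketch: an operator enum plus a filter object that the reader evaluates per entity. The names CompareOp and MetricFilter are hypothetical, metrics are assumed to be long values keyed by metric id, and "like" is left out because it only makes sense for string values.

```java
import java.util.Map;

/** Hypothetical relational operators for numeric metric values. */
enum CompareOp {
    GT, LT, EQ, NE;

    boolean test(long actual, long expected) {
        switch (this) {
            case GT: return actual > expected;
            case LT: return actual < expected;
            case EQ: return actual == expected;
            case NE: return actual != expected;
            default: throw new AssertionError(this);
        }
    }
}

/** Hypothetical filter: "metric <op> value". */
final class MetricFilter {
    private final String metricId;
    private final CompareOp op;
    private final long value;

    MetricFilter(String metricId, CompareOp op, long value) {
        this.metricId = metricId;
        this.op = op;
        this.value = value;
    }

    /** An entity that does not report the metric never matches, even for NE. */
    boolean matches(Map<String, Long> metrics) {
        Long actual = metrics.get(metricId);
        return actual != null && op.test(actual, value);
    }
}
```

Whether a missing metric should match "!=" is one of the semantics that would need to be agreed on before this goes into the interface.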

bq. We do not need flowId and flowRunId to get an entity. But they can still be optional
arguments so that we avoid peeking into the table which gets them based on cluster and appid.
Thoughts?

Yeah, it makes sense. Imagine we have the web UI, and the user is directed from the flow page to
the app page and moves on; he's going to carry the flow information. If the user can provide flowId/flowRunId,
we can locate the entity more efficiently. We can have the two params and make them optional.
Also, it seems that I've missed userId too. It's the first piece of the
entity key. IMHO, we should have it and make it mandatory to avoid scanning through the whole
key space. And it should be reasonable that we take the requester as the user and only search
his entity space, not others'.
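
Roughly, the identifiers for a single-entity query could be carried like this. It is a sketch under assumptions: EntityQueryContext is a hypothetical name, and treating clusterId and appId as mandatory alongside userId is my assumption here, not something settled in the interface.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical holder for the identifiers of a single-entity query. */
final class EntityQueryContext {
    private final String userId;     // mandatory: first piece of the entity key
    private final String clusterId;  // assumed mandatory
    private final String appId;      // assumed mandatory
    private final String flowId;     // optional, null when not supplied
    private final Long flowRunId;    // optional, null when not supplied

    EntityQueryContext(String userId, String clusterId, String appId,
                       String flowId, Long flowRunId) {
        this.userId = Objects.requireNonNull(userId, "userId is mandatory");
        this.clusterId = Objects.requireNonNull(clusterId, "clusterId is mandatory");
        this.appId = Objects.requireNonNull(appId, "appId is mandatory");
        this.flowId = flowId;
        this.flowRunId = flowRunId;
    }

    String getUserId() { return userId; }
    String getClusterId() { return clusterId; }
    String getAppId() { return appId; }
    Optional<String> getFlowId() { return Optional.ofNullable(flowId); }
    Optional<Long> getFlowRunId() { return Optional.ofNullable(flowRunId); }

    /** True when the reader must first resolve the flow from cluster and app id. */
    boolean needsFlowLookup() {
        return flowId == null || flowRunId == null;
    }
}
```

A caller coming from the flow page would pass both flow fields and skip the extra lookup; a caller that only knows the app id passes null for them and pays for the lookup.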

bq. Will we fetch entities across entityTypes? We also have events as filters here. They
may not match across entity types. Thoughts?

Good point. Let's go with a single entityType first.

bq. As per our previous discussion, I had also included metrics time windows in the APIs.
This may aid in plotting graphs for long-running apps. Thoughts?

This seems to belong to (contents to retrieve), and it is not difficult to enforce the window. We
can add this to the param list. One question is whether we want to specify the window per
metric or for all metrics. Personally, I prefer to defer it, together with fetching particular
configs/metrics, to a later enhancement of (contents to retrieve). What do you think?
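
For reference, enforcing a window on one metric's time series could look like the sketch below. MetricWindow and trim are hypothetical names, and the series is assumed to be a map from timestamp in milliseconds to a long value. A single window for all metrics would just apply the same bounds to every series, while a per-metric window would need a map from metric id to bounds.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical helper for trimming a metric time series to a window. */
final class MetricWindow {
    private MetricWindow() {}

    /** Returns a copy of the points whose timestamp lies in [startMs, endMs], both inclusive. */
    static NavigableMap<Long, Long> trim(NavigableMap<Long, Long> series,
                                         long startMs, long endMs) {
        if (startMs > endMs) {
            throw new IllegalArgumentException("startMs must not be after endMs");
        }
        return new TreeMap<>(series.subMap(startMs, true, endMs, true));
    }
}
```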

I've updated the Reader interface accordingly.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch,
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, YARN-3051.wip.02.YARN-2928.patch,
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
> Per design in YARN-2928, create backing storage read interface that can be implemented
> by multiple backing storage implementations.

This message was sent by Atlassian JIRA
