hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs
Date Thu, 27 Aug 2015 17:44:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717146#comment-14717146 ]

Sangjin Lee commented on YARN-4074:
-----------------------------------

cc [~gtCarrera9] and [~vrushalic] also for their thoughts.

There are some options for this, and there are pros and cons. I'm leaning towards the current
proposal ((1) below) for now, but we could enhance this later as the UI jells more.

# do a specific entity query for each of the flow runs obtained from the flow activity entity
# return all flow runs (possibly with limits and time windows) for the given flow
# do a single query for all flow runs specified as a list of flow run ids

One interesting thing to note is that a flow activity entity (record) is an activity of that
flow *for a given day*. In other words, there can be multiple flow activity entities for the
same flow. The flow runs that are returned in the flow activity entity are only for that given
day.
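
To make the per-day scoping concrete, here is a minimal in-memory sketch. It is only an illustration: the class and method names (FlowActivityIndex, recordRun, runsOn, activityDays) are hypothetical and are not part of the timeline reader or the POC patch.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model: one flow activity record per (day, flow) pair. */
final class FlowActivityIndex {
  // day -> flow name -> ids of the flow runs that were active on that day
  private final Map<LocalDate, Map<String, SortedSet<Long>>> byDay = new TreeMap<>();

  /** Registers a flow run under the activity record for the given day. */
  void recordRun(String flow, LocalDate day, long runId) {
    byDay.computeIfAbsent(day, d -> new TreeMap<>())
        .computeIfAbsent(flow, f -> new TreeSet<>())
        .add(runId);
  }

  /** Returns only the flow runs referenced by that day's activity record. */
  SortedSet<Long> runsOn(String flow, LocalDate day) {
    Map<String, SortedSet<Long>> flows = byDay.get(day);
    if (flows == null || !flows.containsKey(flow)) {
      return new TreeSet<>();
    }
    return new TreeSet<>(flows.get(flow));
  }

  /** Lists the days for which the flow has an activity record. */
  List<LocalDate> activityDays(String flow) {
    List<LocalDate> days = new ArrayList<>();
    for (Map.Entry<LocalDate, Map<String, SortedSet<Long>>> e : byDay.entrySet()) {
      if (e.getValue().containsKey(flow)) {
        days.add(e.getKey());
      }
    }
    return days;
  }
}
```

In this sketch a flow that ran on two different days ends up with two activity records, and runsOn() for one day never returns the runs of the other day.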

Then the question is, when I click that flow activity record, what flow runs do I expect to
see? It's a bit ambiguous, but I think it might make more sense to return only the flow runs
that are referenced in that particular day if we're using the flow activity to render the
landing page.

If we assume that, then (2) is probably not needed for this. Then it leaves us with (1) or
(3). The benefit of (1) is that it fits easily into the existing reader API (getEntity). The
downside is that you may need to make multiple reader calls to retrieve flow runs. But normally
the number of flow runs in a day for a given flow should be very small, so it might not be
a big deal.

One hybrid approach may be that the REST API supports URLs that take a list of flow run ids,
while the web service code makes multiple reader getEntity() calls behind them. We'd still need
to define the form of the URLs to support that type of query.
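
A rough sketch of that hybrid, under stated assumptions: the run ids arrive as a comma-separated string (the actual URL form is still undefined), and the lookup function stands in for a single reader getEntity() call. FlowRunListQuery, parseRunIds and fetchFlowRuns are hypothetical names, not existing reader or web service APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical web-service helper: one list-style request, many single-entity reads. */
final class FlowRunListQuery {
  private FlowRunListQuery() {}

  /** Parses a comma-separated list of flow run ids such as "1002,1003". */
  static List<Long> parseRunIds(String csv) {
    List<Long> ids = new ArrayList<>();
    for (String token : csv.split(",")) {
      String trimmed = token.trim();
      if (!trimmed.isEmpty()) {
        ids.add(Long.parseLong(trimmed));
      }
    }
    return ids;
  }

  /**
   * Issues one lookup per flow run id and collects the results in request order.
   * The getEntity function is a stand-in (assumption) for the reader call;
   * ids for which it returns null are skipped.
   */
  static <T> List<T> fetchFlowRuns(String csv, Function<Long, T> getEntity) {
    List<T> runs = new ArrayList<>();
    for (Long id : parseRunIds(csv)) {
      T run = getEntity.apply(id);
      if (run != null) {
        runs.add(run);
      }
    }
    return runs;
  }
}
```

The point of the shape is that the client sees a single list query as in (3), while the reader API stays as it is in (1). If the per-day number of flow runs stays small, the extra reader calls should be cheap.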

Thoughts?

> [timeline reader] implement support for querying for flows and flow runs
> ------------------------------------------------------------------------
>
>                 Key: YARN-4074
>                 URL: https://issues.apache.org/jira/browse/YARN-4074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-4074-YARN-2928.POC.001.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
