hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3391) Clearly define flow ID/ flow run / flow version in API and storage
Date Wed, 01 Apr 2015 19:38:53 GMT

    [ https://issues.apache.org/jira/browse/YARN-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391296#comment-14391296 ]

Vrushali C commented on YARN-3391:

I have some semantic-level comments.
1) bq.  public static String generateDefaultFlowIdBasedOnAppId(ApplicationId appId) {
return "flow_" + appId.getClusterTimestamp() + "_" + appId.getId();

It would be nice to have this string defined as a static final constant somewhere, with the
separator also defined as a static final String.
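
A minimal sketch of what this suggestion could look like. The class name and constant names are
hypothetical and are not taken from the patch, and the method takes the cluster timestamp and id
directly (rather than an ApplicationId) only so that the example stands alone:

```java
// Hypothetical holder class; names are illustrative and not part of the YARN-3391 patch.
final class FlowIdDefaults {
    // Prefix used when no flow id was supplied by the client.
    static final String DEFAULT_FLOW_ID_PREFIX = "flow";
    // Separator between the parts of a generated flow id.
    static final String FLOW_ID_SEPARATOR = "_";

    private FlowIdDefaults() {
        // Constants and static helpers only; no instances.
    }

    // Stand-in for generateDefaultFlowIdBasedOnAppId(ApplicationId): the two
    // arguments correspond to appId.getClusterTimestamp() and appId.getId().
    static String generateDefaultFlowIdBasedOnAppId(long clusterTimestamp, int id) {
        return DEFAULT_FLOW_ID_PREFIX + FLOW_ID_SEPARATOR
            + clusterTimestamp + FLOW_ID_SEPARATOR + id;
    }
}
```

The generated string has the same shape as in the quoted code; only the literals move into named
constants.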

2) I see that flowRun now means flowRunId in this code. I would actually keep the name flowRunId,
because an API call like getFlowRun() suggests to me that it returns the flow run details,
not just the flow run id.
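
To illustrate the naming distinction, here is a rough sketch. FlowRun and FlowContext are
hypothetical types invented for this example, and the numeric run id is an assumption (whether
it should be a number is still an open question on this issue):

```java
// Hypothetical types for illustration only; not taken from the YARN-3391 patch.
final class FlowRun {
    private final long flowRunId;     // assumed numeric; still under discussion
    private final String flowId;
    private final String flowVersion;

    FlowRun(long flowRunId, String flowId, String flowVersion) {
        this.flowRunId = flowRunId;
        this.flowId = flowId;
        this.flowVersion = flowVersion;
    }

    long getFlowRunId() { return flowRunId; }
    String getFlowId() { return flowId; }
    String getFlowVersion() { return flowVersion; }
}

final class FlowContext {
    private final FlowRun flowRun;

    FlowContext(FlowRun flowRun) {
        this.flowRun = flowRun;
    }

    // Returns only the identifier of the run.
    long getFlowRunId() { return flowRun.getFlowRunId(); }

    // Returns the full details of the run, which is what the name implies.
    FlowRun getFlowRun() { return flowRun; }
}
```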

3) Reposting an earlier reply, since JIRA seems to have placed it earlier in the thread. bq. Otherwise,
if we use the job name, for example, all the wordcount jobs will belong to one flow then by

Yes, that's exactly what they are. All wordcount jobs belong to the same flow "wordcount"
by that user, and each run of wordcount is a flow run. In fact, they should not end up
being separate flows.
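
As a rough sketch of this grouping, assuming a flow is identified by user plus flow name and that
run ids are numeric (FlowRunIndex is a hypothetical in-memory class, not the timeline storage
schema):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory index: user -> flow name -> flow run ids.
final class FlowRunIndex {
    private final Map<String, Map<String, List<Long>>> runs = new HashMap<>();

    // Each submission adds a run to the existing flow instead of creating a new flow.
    void recordRun(String user, String flowName, long flowRunId) {
        runs.computeIfAbsent(user, u -> new HashMap<>())
            .computeIfAbsent(flowName, f -> new ArrayList<>())
            .add(flowRunId);
    }

    // Number of distinct flows owned by the user.
    int flowCount(String user) {
        return runs.getOrDefault(user, Collections.emptyMap()).size();
    }

    // Run ids recorded for one flow of one user, in insertion order.
    List<Long> runsOf(String user, String flowName) {
        return runs.getOrDefault(user, Collections.emptyMap())
                   .getOrDefault(flowName, Collections.emptyList());
    }
}
```

With this model, three wordcount submissions by the same user give one flow with three flow runs.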

> Clearly define flow ID/ flow run / flow version in API and storage
> ------------------------------------------------------------------
>                 Key: YARN-3391
>                 URL: https://issues.apache.org/jira/browse/YARN-3391
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-3391.1.patch
> To continue the discussion in YARN-3040, let's figure out the best way to describe the
> Some key issues that we need to conclude on:
> - How do we include the flow version in the context so that it gets passed into the collector and to the storage eventually?
> - Flow run id should be a number as opposed to a generic string?
> - Default behavior for the flow run id if it is missing (i.e. client did not set it)
> - How do we handle flow attributes in case of nested levels of flows?

This message was sent by Atlassian JIRA
