falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-286) Capture information in process entity about the user workflow
Date Wed, 05 Mar 2014 18:38:43 GMT

    [ https://issues.apache.org/jira/browse/FALCON-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921178#comment-13921178 ]

Venkatesh Seetharam commented on FALCON-286:
--------------------------------------------

bq. How is this info used?
These allow the user to name and version the user workflow and are stored as properties of
the process node in the entity graph. 
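
A rough sketch of the idea, not Falcon's actual code: the dict-based graph, the function name, and the property keys userWorkflowName and userWorkflowVersion are assumptions made up for this illustration.

```python
# Hypothetical sketch: a plain dict stands in for the entity graph.
# The property keys below are illustrative, not Falcon's real names.

def add_process_node(graph, process_name, workflow_name, workflow_version):
    """Store a process node carrying the user workflow name and version."""
    graph[process_name] = {
        "type": "process",
        "userWorkflowName": workflow_name,
        "userWorkflowVersion": workflow_version,
    }
    return graph[process_name]


graph = {}
node = add_process_node(
    graph, "impression-click-join", "impression-click-join-wf", "1.0.0"
)
print(node["userWorkflowName"], node["userWorkflowVersion"])
```

Keeping the name and version as node properties means a lineage query can filter process nodes by workflow without reparsing the entity definition.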

bq. The workflow properties that are set and used by falcon, can we have a convention for
the property name( and document it)? If user also defines the same property, falcon post-processing
will fail
This is true for entity names as well. If a user picks a name for a feed or process, it should
be the same. Typically, it's the user workflow name, like impression-click-join-wf.

bq. initially replication was designed for same input/output path, but now make sense to have
output event
This is used only in the post processing and lineage.

bq. In process mapper, value is coming from XSD, but for feed it is hardcoded here, should
we follow the same convention and add falcon as another engineType
This is only to drive lineage for replicated and evicted feeds. Which workflow replicated
the data? It's Falcon. It's not an option for the end user. Maybe it could be assumed in the
lineage code as well, but the arg hell in post processing was a limiting factor, I think.

bq. this seems to be bug fix after making this as optional.
There were quite a few bugs. It would never have worked for a process with no outputs. Added
a unit test as well.

bq. This will create a problem, basically user can define in a process, for a same feed multiple
inputEvents/outputEvents with different ranges.
How will this create a problem? I'm only adding corresponding feed names for each input which
will be used to create relationships for the feed entity from instance in the graph.
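
To illustrate, here is a minimal sketch that assumes relationships are keyed by feed name only. The function, the dict layout of the inputs, and the sample names are hypothetical and not taken from the patch.

```python
# Hypothetical sketch: derive feed relationships from a process's inputs.
# Assumption: an edge is identified by feed name, so instance ranges
# (start/end) play no part in which relationships get created.

def feed_edges(process_name, inputs):
    """Return one (feed, process) edge per distinct feed name."""
    feed_names = {inp["feed"] for inp in inputs}
    return sorted((feed, process_name) for feed in feed_names)


inputs = [
    {"name": "clicks-today", "feed": "clicks",
     "start": "today(0,0)", "end": "today(23,0)"},
    {"name": "clicks-yesterday", "feed": "clicks",
     "start": "yesterday(0,0)", "end": "yesterday(23,0)"},
    {"name": "impressions", "feed": "impressions",
     "start": "today(0,0)", "end": "today(23,0)"},
]

edges = feed_edges("impression-click-join", inputs)
print(edges)
```

Under that assumption, two inputs on the same feed with different ranges collapse into a single relationship to that feed.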

Is this a +1? Shall I commit this?

> Capture information in process entity about the user workflow
> -------------------------------------------------------------
>
>                 Key: FALCON-286
>                 URL: https://issues.apache.org/jira/browse/FALCON-286
>             Project: Falcon
>          Issue Type: Sub-task
>    Affects Versions: 0.5
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkatesh Seetharam
>              Labels: lineage
>         Attachments: FALCON-286-v1.patch, FALCON-286-v2.patch, FALCON-286.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
