falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-287) Record lineage information in post processing
Date Wed, 05 Mar 2014 19:20:43 GMT

    [ https://issues.apache.org/jira/browse/FALCON-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921251#comment-13921251 ]

Venkatesh Seetharam commented on FALCON-287:
--------------------------------------------

[~shaik.idris], thanks for taking the time to review.

bq. May be I missed the actual use-case of lineage, why do we need to persist it and who are
the consumers of this information.
Typical use cases for lineage are impact analysis, tracing how a feed was generated back to its
source, etc.

bq. Secondly, instead of FalconPostProcessing storing this data on HDFS, which might further
slowdown user workflow
This is a tradeoff between simplicity and efficiency. Serializing to JSON and writing it to a file
on HDFS should not be slow, and the current implementation is also quite inefficient IMO. The
downside of this approach is the NN (NameNode) namespace problem, but we are cleaning this up proactively.
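
To make the cost concrete, here is a minimal sketch of "serialize to JSON and write one small file". It is not the code from the attached patches: the class name LineageJsonSketch, its methods, and the flat string-to-string record are assumptions made for illustration, and java.nio.file stands in for the Hadoop FileSystem API so the sketch runs without a cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical sketch; NOT the LineageRecorder from the FALCON-287 patch.
final class LineageJsonSketch {

    // Escape the characters that must be escaped inside a JSON string.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Serialize a flat string map as one JSON object, keys in sorted order.
    static String toJson(Map<String, String> record) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : new TreeMap<>(record).entrySet()) {
            joiner.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return joiner.toString();
    }

    // Write the record as a single small file and return the path written.
    // Assumption: the local filesystem stands in for HDFS here.
    static Path write(Path dir, String fileName, Map<String, String> record) {
        try {
            Files.createDirectories(dir);
            byte[] bytes = toJson(record).getBytes(StandardCharsets.UTF_8);
            return Files.write(dir.resolve(fileName), bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each call produces one small file, which is cheap per write but is also why the namespace concern above arises: every file is another entry for the NameNode to track until it is cleaned up.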


bq. May be I got the intent of storing this, but what all additional data we require for each
feed.
Things like sizes, scheme, etc. I do not know it all at this time, but this framework will
not need any change to the message passing structure, will allow schema evolution, and will
not bleed. It's contained in LineageRecorder.
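
One way a recorder could absorb new attributes without changing the message structure is to keep the record as an open key-value map. The sketch below assumes that design; this message does not say how LineageRecorder actually does it, and the class name EvolvingRecordSketch and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; class and key names are illustrative only.
final class EvolvingRecordSketch {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Producers add whatever attributes they know about; there is no fixed schema.
    EvolvingRecordSketch put(String key, String value) {
        fields.put(key, value);
        return this;
    }

    // Consumers ask for a key and fall back to a default when an older
    // producer never wrote it.
    String get(String key, String defaultValue) {
        return fields.getOrDefault(key, defaultValue);
    }

    // Read-only view, so callers cannot modify the record through it.
    Map<String, String> asMap() {
        return Collections.unmodifiableMap(fields);
    }
}
```

With this shape, adding an attribute later (a size, for example) is one more put call on the producer side, and consumers that do not know the key simply ignore it.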

> Record lineage information in post processing
> ---------------------------------------------
>
>                 Key: FALCON-287
>                 URL: https://issues.apache.org/jira/browse/FALCON-287
>             Project: Falcon
>          Issue Type: Sub-task
>    Affects Versions: 0.5
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkatesh Seetharam
>              Labels: lineage
>         Attachments: FALCON-287-v1.patch, FALCON-287-v2.patch, FALCON-287.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
