falcon-dev mailing list archives

From "Ajay Yadava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-1406) Effective time in Entity updates.
Date Wed, 23 Nov 2016 14:29:58 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690243#comment-15690243 ]

Ajay Yadava commented on FALCON-1406:
-------------------------------------

{quote} a lot of features developed in falcon took the immutability of *entity definition*
for past instances as given {quote}

Rerunning old instances doesn't affect that. New entities create new instances and don't affect
"old instances", though they might overwrite the result of other entities.


{quote}This is a much cleaner way of retaining history than the current scheme {quote}
I understand and agree with the motivation, but IMHO this approach pollutes history. The
overlapping instances did run and did have a status; selectively nuking them with this
change looks clean but actually creates more problems than it solves. A better approach
to retaining history would be entity versioning.
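
As a rough illustration of what entity versioning could look like, here is a minimal
Python sketch. It assumes an append-only history of definitions keyed by effective time;
the class and method names (VersionedEntity, add_version, definition_for) are hypothetical
and are not part of Falcon's API.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class VersionedEntity:
    """Hypothetical append-only history of entity definitions (not Falcon API)."""

    name: str
    # Parallel lists, kept sorted by effective time.
    _effective: List[datetime] = field(default_factory=list, init=False)
    _definitions: List[str] = field(default_factory=list, init=False)

    def add_version(self, effective_from: datetime, definition: str) -> None:
        # Insert in sorted position; existing versions are never modified or removed.
        idx = bisect_right(self._effective, effective_from)
        self._effective.insert(idx, effective_from)
        self._definitions.insert(idx, definition)

    def definition_for(self, instance_time: datetime) -> Optional[str]:
        # Pick the latest version whose effective time is <= the instance time.
        idx = bisect_right(self._effective, instance_time)
        return self._definitions[idx - 1] if idx else None


entity = VersionedEntity("sample-process")
entity.add_version(datetime(2016, 1, 1), "definition-v1")
entity.add_version(datetime(2016, 6, 1), "definition-v2")

# An instance from March still resolves to the first definition.
print(entity.definition_for(datetime(2016, 3, 1)))
```

Under this assumption, an update with an effective time in the past adds a version rather
than replacing one, so the record of what older instances ran with stays intact.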

The workaround you have now suggested is the one that doesn't leave a "mess" (defunct and
similarly named entities), so it needs no cleanup. However, there are many other cases
where such a "mess" will exist, and a tool can definitely be built to highlight such
entities, older than a given time range, so that they can be deleted. It will also be useful
if someone chooses not to delete and recreate the entity but instead updates the entity and
reprocesses old instances with a backfill job. Both approaches have their own pros and cons,
and neither is ideal.
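
The cleanup tool mentioned above could start as something as small as the following Python
sketch. It assumes each entity exposes a name and the end of its validity window; the
EntityInfo type and the find_stale function are hypothetical names for illustration, not
an existing Falcon utility, and the sketch only reports candidates without deleting anything.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class EntityInfo(NamedTuple):
    """Hypothetical summary of an entity (not a Falcon type)."""

    name: str
    end_time: datetime  # assumed: when the entity's validity window ended


def find_stale(entities: Iterable[EntityInfo], cutoff: datetime) -> List[str]:
    # Report entities whose validity ended strictly before the cutoff.
    return sorted(e.name for e in entities if e.end_time < cutoff)


entities = [
    EntityInfo("clicks-v1", datetime(2016, 2, 1)),
    EntityInfo("clicks-v2", datetime(2016, 9, 1)),
    EntityInfo("clicks-v3", datetime(2017, 1, 1)),
]

# Candidates for deletion: everything that ended before 1 October 2016.
stale = find_stale(entities, datetime(2016, 10, 1))
print(stale)
```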


> Effective time in Entity updates.
> ---------------------------------
>
>                 Key: FALCON-1406
>                 URL: https://issues.apache.org/jira/browse/FALCON-1406
>             Project: Falcon
>          Issue Type: New Feature
>            Reporter: sandeep samudrala
>            Assignee: sandeep samudrala
>         Attachments: FALCON-1406-initial.patch, effective_time_in_entity_updates.pdf
>
>
> Effective time for entity updates needs to be supported for past times too. An effective
> time capability was provided in the past, but it only allows setting an effective time at
> the current or a future time (now + delay), which could not solve all the issues.
> The following are a few scenarios that would require effective time to be available in
> the past.
> a) New code being deployed for an incompatible input data set, which would leave instances
> with old code and new data.
> b) Bad code being pushed, after which the entity should be able to go back in time to
> replay (rerun) with new code.
> c) Orchestration-level changes (good/bad) would need the ability to go back in time
> to start with.
> For reference: linking all the Jiras that have been worked on around effective time.
> https://issues.apache.org/jira/browse/FALCON-374
> https://issues.apache.org/jira/browse/FALCON-297



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
