airflow-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-3314) The lineage automatic inlets feature does not work as described.
Date Thu, 08 Nov 2018 03:31:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679259#comment-16679259
] 

ASF GitHub Bot commented on AIRFLOW-3314:
-----------------------------------------

reubenvanammers opened a new pull request #4156: [AIRFLOW-3314] Changed auto inlets feature
to work as described.
URL: https://github.com/apache/incubator-airflow/pull/4156
 
 
   Currently, the automatic inlets feature is documented as being able to
   skip operations that produce no outlets. This PR changes the behaviour of
   the auto inlets feature so that operations that produce no outlets are
   effectively ignored, while the task still receives all outlets from its
   upstream tasks.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-XXX
     - In case you are fixing a typo in the documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI changes:
   
   Uses a tree search over upstream tasks to find the outlets to be used as inlets. This provides
the behaviour for receiving outlets that the existing documentation describes.
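
The upstream search described here can be sketched roughly as follows. This is a minimal illustration only, not the code from the PR: the `Task` class and `collect_auto_inlets` function are hypothetical stand-ins that do not use Airflow's real operator or lineage API, and it assumes the search stops descending a branch once it reaches a task that defines outlets.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Task:
    """Hypothetical stand-in for an operator; not Airflow's real class."""
    name: str
    outlets: List[str] = field(default_factory=list)
    upstream: List["Task"] = field(default_factory=list)


def collect_auto_inlets(task: Task) -> List[str]:
    """Gather outlets from the nearest upstream tasks that define any."""
    inlets: List[str] = []
    visited: Set[str] = set()
    stack = list(task.upstream)
    while stack:
        current = stack.pop()
        if current.name in visited:
            continue
        visited.add(current.name)
        if current.outlets:
            # This branch produced state: take its outlets and stop here
            # (an assumption of this sketch).
            inlets.extend(o for o in current.outlets if o not in inlets)
        else:
            # No outlets: treat the task as transparent and look further up.
            stack.extend(current.upstream)
    return inlets


# A -> B -> C, where only A defines an outlet (the path is made up).
a = Task("A", outlets=["/tmp/a_output"])
b = Task("B", upstream=[a])
c = Task("C", upstream=[b])
```

With this chain, `collect_auto_inlets(c)` looks past B, which has no outlets, and picks up A's outlet, matching the A -> B -> C case in the docstring quoted in the Jira issue.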
   
   @bolkedebruin, do you have any input on these changes?
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely
good reason:
   Adds unit tests to tests/lineage/test_lineage.py
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I have squashed
multiple commits if they address the same issue. In addition, my commits follow the guidelines
from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes how to use
it.
     - When adding new operators/hooks/sensors, the autoclass documentation generation needs
to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
   Sorry if I'm breaking any guidelines, this is my first OSS pull request. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> The lineage automatic inlets feature does not work as described.
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-3314
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3314
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Reuben van Ammers
>            Priority: Minor
>         Attachments: test_lineage_broken.py
>
>
> Currently, this is the description of the arguments to the prepare lineage wrapper regarding
inlets in airflow/lineage/__init__.py:
> inlets can be:
>  "auto" -> picks up any outlets from direct upstream tasks that have outlets
>  defined, as such that if A -> B -> C and B does not have outlets but A does,
>  these are provided as inlets.
> This implies that non-state-producing tasks should have no effect on the behaviour of
inlets and outlets, which is desirable because it makes it easy to add operators that don't
change the state of files (such as for tracking/communication).
>  
> This is not the current behaviour. This can be seen by changing the test case in tests/lineage/test_lineage.py
so that operation 5 uses the auto feature. Given the description, one would expect
1 inlet file, the file produced from op3. However, as can be seen in the attached broken
test, this is not the case, and the presence of the non-state-affecting operator breaks the
test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
