airflow-dev mailing list archives

From Vijay Ramesh <vi...@change.org>
Subject Re: Task Instance Results not stored for SubDAG Tasks in 1.8
Date Mon, 20 Mar 2017 04:03:00 GMT
I commented on your Jira ticket, but the instances do exist in the
task_instance table; they just have a nested dag_id.

Main dag_id (where the subdag operator task_instance shows up):

airflow=> select * from task_instance where dag_id =
'signatures_by_country' order by execution_date desc limit 1 \x\g
Expanded display is on.
-[ RECORD 1 ]---+-----------------------------------------------------------
task_id         | load_redis
dag_id          | signatures_by_country
execution_date  | 2017-03-19 21:00:00
start_date      | 2017-03-20 03:00:36.480067
end_date        | 2017-03-20 03:18:52.135888
duration        | 1095.655821
state           | success
try_number      | 1
hostname        | airflow-scheduler-0ebce614fbebe5655.pdx.prod.changeeng.org
unixname        | airflow
job_id          | 13381
pool            |
queue           | default
priority_weight | 1
operator        | SubDagOperator
queued_dttm     | 2017-03-20 03:00:33.837705
pid             | 19859


Nested dag_id of the form <parent_dag_id>.<sub_dag_operator_task_id> (where
the subdag's tasks show up):

airflow=> select * from task_instance where dag_id like
'signatures_by_country.%' order by execution_date desc limit 1 \x\g
Expanded display is on.
-[ RECORD 1 ]---+-----------------------------------------------------------
task_id         | unload
dag_id          | signatures_by_country.load_redis
execution_date  | 2017-03-19 21:00:00
start_date      | 2017-03-20 03:00:43.816051
end_date        | 2017-03-20 03:18:22.80127
duration        | 1058.985219
state           | success
try_number      | 1
hostname        | airflow-scheduler-0ebce614fbebe5655.pdx.prod.changeeng.org
unixname        | airflow
job_id          | 13383
pool            | redshift_pool
queue           | default
priority_weight | 3
operator        | RedshiftUnloadOperator
queued_dttm     |
pid             | 19924
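
The naming pattern in the two queries above can be sketched in Python. This is only an illustration under stated assumptions: the helper names (subdag_dag_id, subdag_task_instances) are hypothetical, an in-memory SQLite table stands in for the real Postgres metadata database, and only a few of the task_instance columns are reproduced.

```python
import sqlite3


def subdag_dag_id(parent_dag_id, subdag_task_id):
    """Build the nested dag_id: <parent_dag_id>.<sub_dag_operator_task_id>."""
    return f"{parent_dag_id}.{subdag_task_id}"


def subdag_task_instances(conn, parent_dag_id, subdag_task_id):
    """Return (task_id, state) rows recorded under one subdag's nested dag_id."""
    cur = conn.execute(
        "SELECT task_id, state FROM task_instance "
        "WHERE dag_id = ? ORDER BY execution_date DESC, task_id",
        (subdag_dag_id(parent_dag_id, subdag_task_id),),
    )
    return cur.fetchall()


# Assumption: a cut-down stand-in for the task_instance table, loaded with
# the two rows shown in the psql output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance "
    "(task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("load_redis", "signatures_by_country", "2017-03-19 21:00:00", "success"),
        ("unload", "signatures_by_country.load_redis", "2017-03-19 21:00:00", "success"),
    ],
)

rows = subdag_task_instances(conn, "signatures_by_country", "load_redis")
print(rows)
```

Matching the nested dag_id exactly, rather than with LIKE, avoids the underscore in the DAG name being treated as a single-character wildcard.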


That might help unblock you a bit at least...

On Sun, Mar 19, 2017 at 7:34 PM, Joe Schmid <jschmid@symphonyrm.com> wrote:

> First, congratulations to the contributors on getting 1.8 out and
> especially to Bolke for the Herculean effort!
>
> I noticed one issue in 1.8 that we didn't see in the previous Airflow
> version that we've been running in production and staging. I've filed Jira
> https://issues.apache.org/jira/browse/AIRFLOW-1011 with a test DAG and some
> screenshots. Here's the description from the Jira:
>
> ------------------------------------------------------------
> Before 1.8, results for tasks executed as a subdag were written as rows to
> task_instances. In Airflow 1.8 only rows for tasks inside the top-level DAG
> (non-subdag tasks) seem to get written to the database.
>
> This results in being unable to check the status of task instances inside
> the subdag from the UI, check the logs for those task instances from the
> UI, etc.
>
> Attached is a simple test DAG that exhibits the issue along with
> screenshots showing the UI differences between v1.8 and v1.7.1.3.
>
> Note that if the DAG is run via backfill from command line (e.g. "airflow
> backfill Test_SubDAG -s 2017-03-18 -e 2017-03-18") the task instances show
> up successfully.
>
> Also, we're using CeleryExecutor and not specifying a different executor
> for our subdags.
> ------------------------------------------------------------
>
> If I can provide more info, let me know.
>
> Thanks,
> -Joe
>
