airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Re: Subdag failure
Date Thu, 01 Jun 2017 15:50:55 GMT
>From memory, I'm pretty sure `--local` simply means "run locally without
going through an executor" (as opposed to "run through LocalExecutor" that
you assume). For the record, there are 3 levels of `airflow run`, let's go
deeper:

1. `airflow run` which typically is only called through a user using the
CLI, and talks to the executor
2. `airflow run --local` which is executed locally, "on the other side" of
the executor, and wraps level 3 (explained bellow) while managing
heartbeats and looking for external kill signals. If you us the CGROUP or
impersonation feature, I believe this has to run with elevated priviledges.
Workers typically receive and execute these commands.
3. `airflow run --raw` which actually runs the operator's execute method,
impersonation and CGROUPs restrictions apply here

This is a bit confusing and it could help to make `--local` and `--raw`
more "private", as in for system/internal use only.

I also just noticed that the `--local` for `airflow run` is wrong in the
docs as it's re-using the backfill help text, I'll send a PR to fix the
docs.

Max

On Thu, Jun 1, 2017 at 1:31 AM, Zsolt Tóth <toth.zsolt.bme@gmail.com> wrote:

> Hi devs,
>
> I got an issue with airflow subdags, that I can't debug or trace.
> A specific subdag fails nondeterministically (usually it fails) when it's
> scheduled by Airflow, and runs fine when the task is re-executed manually.
>
> For the first subtask in the dag the logs can't be found by airflow, but
> that may be a different issue. If the subtask succeeds, Airflow finds the
> logs, too.
>
> *** Log file isn't local.…*** Fetching here:
> http://None:8793/log/<parent_dag_id.subdag_id>/<task_id>/
> 2017-05-28T00:45:00…***
> Failed to fetch log file from worker.
>
> If I look into the logs, I can't find any specific details. The
> SubdagOperator's logs:
>
> [2017-05-30 14:06:06,117] {models.py:1219} INFO - Executing
> <Task(SubDagOperator): <subdag_id>> on 2017-05-18 00:45:00
> [2017-05-30 14:06:06,247] {base_executor.py:36} INFO - Adding to queue:
> airflow run <parent_dag_id.subdag_id> <task_id> 2017-05-18T00:45:00 --local
> -sd DAGS_FOLDER/parent_dag_file.py
> [2017-05-30 14:06:11,151] {sequential_executor.py:26} INFO - Executing
> command: airflow run <parent_dag_id.subdag_id> <task_id>
> 2017-05-18T00:45:00 --local -sd DAGS_FOLDER/parent_dag_file.py
> [2017-05-30 14:06:12,119] {sequential_executor.py:33} ERROR - Failed to
> execute task Command 'airflow run <parent_dag_id.subdag_id> <task_id>
> 2017-05-18T00:45:00 --local -sd DAGS_FOLDER/parent_dag_file.py ' returned
> non-zero exit status 1:
> [2017-05-30 14:06:12,125] {jobs.py:924} ERROR - Task instance
> ('<parent_dag_id.subdag_id>', '<task_id>', datetime.datetime(2017, 5, 18,
> 0, 45)) failed
> [2017-05-30 14:06:12,125] {jobs.py:997} INFO - [backfill progress] |
> waiting: 2 | succeeded: 0 | kicked_off: 1 | failed: 1 | skipped: 0 |
> deadlocked: 0
> [2017-05-30 14:06:16,150] {jobs.py:997} INFO - [backfill progress] |
> waiting: 0 | succeeded: 0 | kicked_off: 1 | failed: 1 | skipped: 0 |
> deadlocked: 2
> [2017-05-30 14:06:16,182] {models.py:1286} ERROR -
> ---------------------------------------------------
> Some task instances failed:
> set([('<parent_dag_id.subdag_id>', '<task_id>', datetime.datetime(2017, 5,
> 18, 0, 45))])
> ...
>
> Regarding the subdag's tasks, only the first task contains any logs, and
> only this single line:
> [2017-05-30 14:06:11,980] {models.py:154} INFO - Filling up the DagBag from
> <parent_dag_file>
>
> I suspected that the issue may be that Airflow is configured to use the
> SequentialExecutor, but the airflow run command in the logs has the --local
> switch, which I think forces the LocalExecutor. I tried to change this to
> SequentialExecutor by setting the SubdagOperator's executor field, but the
> --local switch is still there.
>
> I'm using Airflow 1.7.1.3.
>
> Thanks,
> Zsolt
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message