airflow-commits mailing list archives

From "Gabriele Angeletti (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (AIRFLOW-2188) Airflow DAG not running on scheduler
Date Wed, 07 Mar 2018 14:03:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriele Angeletti closed AIRFLOW-2188.
---------------------------------------
    Resolution: Fixed

> Airflow DAG not running on scheduler
> ------------------------------------
>
>                 Key: AIRFLOW-2188
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2188
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.9.0
>         Environment: Ubuntu server 16.04 AMI on Amazon AWS.
>            Reporter: Gabriele Angeletti
>            Priority: Minor
>             Fix For: 1.9.0
>
>
> I have an Airflow system (1.9.0) with the scheduler and the web service up and running. I wrote a simple DAG that just prints hello world. The code for the DAG is the following:
> {code:python}
> import datetime
> import airflow
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> default_args = dict(
>   owner="blackecho",
>   start_date=datetime.datetime(2018, 3, 4),
>   depends_on_past=False,
>   retries=1,
>   retry_delay=datetime.timedelta(minutes=5),
> )
> dag = DAG(
>   dag_id="HelloWorld_2",
>   default_args=default_args,
>   schedule_interval=datetime.timedelta(minutes=10),
> )
> t1 = BashOperator(
>   task_id="hello",
>   bash_command="echo Hello world && sleep 1",
>   dag=dag,
> )
> if __name__ == "__main__":
>   dag.cli()
> {code}
> The problem is that the scheduler is not working:
> * *airflow test HelloWorld_2 hello 2018-03-04* - this works with logs
> * *airflow run HelloWorld_2 hello 2018-03-04* - this works with logs
> * *airflow backfill HelloWorld_2 -s 2018-03-02 -e 2018-03-03* - this works with logs
> * *airflow trigger_dag HelloWorld_2* - not working, and logs are not generated
> * Click trigger dag from the Web UI - not working, and logs are not generated
> * Scheduled jobs - not working, and logs are not generated
> Here are some of my configs:
> airflow_home = /home/ubuntu/airflow
> dags_folder = /home/ubuntu/airflow/dags
> base_log_folder = /home/ubuntu/airflow/logs
> remote_logging = False
> remote_log_conn_id =
> encrypt_s3_logs = False
> logging_config_class =
> executor = LocalExecutor
> task_runner = BashTaskRunner
> task_log_reader = file.task
> Also, the scheduler is printing *No tasks to consider for execution.*, though I don't know if that's related. The message appears because the *task_instance* table contains no task instances whose state is *scheduled*, so the following query, executed in jobs.py (*_find_executable_task_instances*), returns nothing:
> {code:sql}
> SELECT
>     task_instance.try_number,
>     task_instance.task_id,
>     task_instance.dag_id,
>     task_instance.execution_date,
>     task_instance.start_date,
>     task_instance.end_date,
>     task_instance.duration,
>     task_instance.state,
>     task_instance.max_tries,
>     task_instance.hostname,
>     task_instance.unixname,
>     task_instance.job_id,
>     task_instance.pool,
>     task_instance.queue,
>     task_instance.priority_weight,
>     task_instance.operator,
>     task_instance.queued_dttm,
>     task_instance.pid
> FROM
>     task_instance
> LEFT OUTER JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date
> LEFT OUTER JOIN dag ON dag.dag_id = task_instance.dag_id
> WHERE
>     task_instance.dag_id IN ('HelloWorld_2') AND
>     (dag_run.run_id IS NULL OR dag_run.run_id NOT LIKE 'backfill_%%') AND
>     (dag.dag_id IS NULL OR NOT dag.is_paused) AND
>     task_instance.state IN ('scheduled');
> {code}
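
The filter logic of the query above can be tried out in isolation. The following is only a minimal sketch, assuming an in-memory SQLite database and a cut-down, hypothetical schema that keeps just the columns the joins and the WHERE clause touch; the helper names (make_db, find_scheduled) and the toy rows are illustrative and are not part of Airflow. It shows that the query returns nothing as long as no task instance is in state 'scheduled', regardless of which dag_run rows exist.

```python
import sqlite3


def make_db():
    """Create a toy schema (hypothetical subset of Airflow's tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER);
        CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, run_id TEXT);
        CREATE TABLE task_instance (
            task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT
        );
        """
    )
    return conn


def find_scheduled(conn, dag_id):
    """Apply the same joins and filters as the query in the report."""
    # The report shows 'backfill_%%'; the doubled percent sign looks like
    # driver-level escaping, so a single '%' is used with sqlite3 here.
    return conn.execute(
        """
        SELECT task_instance.task_id, task_instance.execution_date
        FROM task_instance
        LEFT OUTER JOIN dag_run
            ON dag_run.dag_id = task_instance.dag_id
           AND dag_run.execution_date = task_instance.execution_date
        LEFT OUTER JOIN dag ON dag.dag_id = task_instance.dag_id
        WHERE task_instance.dag_id IN (?)
          AND (dag_run.run_id IS NULL OR dag_run.run_id NOT LIKE 'backfill_%')
          AND (dag.dag_id IS NULL OR NOT dag.is_paused)
          AND task_instance.state IN ('scheduled')
        """,
        (dag_id,),
    ).fetchall()


conn = make_db()
conn.execute("INSERT INTO dag VALUES ('HelloWorld_2', 0)")
# A finished backfill run: its task instance is in state 'success'.
conn.execute(
    "INSERT INTO dag_run VALUES ('HelloWorld_2', '2018-03-02', 'backfill_2018-03-02')"
)
conn.execute(
    "INSERT INTO task_instance VALUES ('hello', 'HelloWorld_2', '2018-03-02', 'success')"
)
# A manually triggered run for which no task instance was ever created.
conn.execute(
    "INSERT INTO dag_run VALUES ('HelloWorld_2', '2018-03-07', 'manual__2018-03-07')"
)
no_rows = find_scheduled(conn, "HelloWorld_2")

# Once a task instance reaches state 'scheduled', the query picks it up.
conn.execute(
    "INSERT INTO task_instance VALUES ('hello', 'HelloWorld_2', '2018-03-07', 'scheduled')"
)
one_row = find_scheduled(conn, "HelloWorld_2")

print(no_rows)
print(one_row)
```

In this toy setup the empty result comes from the missing 'scheduled' row rather than from the joins, which suggests looking at why task instances never reach that state instead of at the query itself.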



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
