airflow-commits mailing list archives

From "Olivier Girardot (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-699) Dag can't be triggered at the same second due to constraint on dag_id + execution_date
Date Wed, 14 Dec 2016 13:51:58 GMT
Olivier Girardot created AIRFLOW-699:
----------------------------------------

             Summary: Dag can't be triggered at the same second due to constraint on dag_id + execution_date
                 Key: AIRFLOW-699
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-699
             Project: Apache Airflow
          Issue Type: Bug
          Components: db
    Affects Versions: Airflow 1.7.1.3
            Reporter: Olivier Girardot


We have a system that triggers DAGs when several files arrive in HDFS. We craft a unique
run_id to trace each trigger, but the dag_run table has the following schema:

{code:sql}
CREATE TABLE `dag_run` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dag_id` varchar(250) DEFAULT NULL,
  `execution_date` datetime DEFAULT NULL,
  `state` varchar(50) DEFAULT NULL,
  `run_id` varchar(250) DEFAULT NULL,
  `external_trigger` tinyint(1) DEFAULT NULL,
  `conf` blob,
  `end_date` datetime DEFAULT NULL,
  `start_date` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `dag_id` (`dag_id`,`execution_date`),
  UNIQUE KEY `dag_id_2` (`dag_id`,`run_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2998 DEFAULT CHARSET=latin1
{code}

Because of the unique key on (dag_id, execution_date), we end up with a duplicate-entry IntegrityError:
{noformat}
sqlalchemy.exc.IntegrityError: (_mysql_exceptions.IntegrityError) (1062, "Duplicate entry
'my-job-2016-12-13 19:52:33' for key 'dag_id'") [SQL: u'INSERT INTO dag_run (dag_id, execution_date,
start_date, end_date, state, run_id, external_trigger, conf) VALUES (%s, %s, now(), %s, %s,
%s, %s, %s)'] [parameters: ('my-job', datetime.datetime(2016, 12, 13, 19, 52, 33, 210790),
None, u'running', 'my-job-custom-run-id_2016-12-13T19:52:32.291_785a2860-c622-47f6-a29c-4c6394f931fa',
1, "...")
{noformat}
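
The collision can be reproduced outside Airflow. The following is a minimal sketch, not Airflow code: it uses an in-memory SQLite table that carries only the two unique keys from the schema above, and it assumes that the stored execution_date loses its microseconds (modelled here by dropping them). The helper names, the run ids and the second timestamp are hypothetical.

```python
import sqlite3
from datetime import datetime


def to_second_precision(ts):
    # Assumption: a DATETIME column without fractional seconds cannot
    # distinguish two timestamps within the same second; we model that
    # by dropping the microseconds before storing the value.
    return ts.replace(microsecond=0).strftime("%Y-%m-%d %H:%M:%S")


def trigger(conn, dag_id, run_id, ts):
    # Hypothetical stand-in for whatever inserts a row into dag_run.
    conn.execute(
        "INSERT INTO dag_run (dag_id, execution_date, run_id) VALUES (?, ?, ?)",
        (dag_id, to_second_precision(ts), run_id),
    )


def reproduce():
    conn = sqlite3.connect(":memory:")
    # Reduced version of the dag_run schema: only the columns involved
    # in the two unique keys are kept.
    conn.execute(
        """
        CREATE TABLE dag_run (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            dag_id TEXT,
            execution_date TEXT,
            run_id TEXT,
            UNIQUE (dag_id, execution_date),
            UNIQUE (dag_id, run_id)
        )
        """
    )
    first = datetime(2016, 12, 13, 19, 52, 33, 210790)   # from the error above
    second = datetime(2016, 12, 13, 19, 52, 33, 734112)  # hypothetical second trigger
    trigger(conn, "my-job", "run-a", first)
    try:
        # Different run_id, same dag_id, same second: the first unique key fires.
        trigger(conn, "my-job", "run-b", second)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce())
```

The two run ids are distinct, so the (dag_id, run_id) key is satisfied; only the (dag_id, execution_date) key rejects the second row.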

Is this constraint needed? The second-level precision of the "datetime" column is a problem
for us, because several runs of the same DAG are commonly triggered within the same second.
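
To illustrate why the precision matters, here is a small sketch that checks a batch of triggers for (dag_id, execution_date) collisions with and without microseconds. It is only an illustration under assumptions: the function name and the sample triggers are hypothetical, and it says nothing about how Airflow or MySQL would actually store the values.

```python
from collections import Counter
from datetime import datetime


def colliding_keys(triggers, keep_microseconds):
    """Return the (dag_id, execution_date) keys that occur more than once."""
    keys = []
    for dag_id, ts in triggers:
        if not keep_microseconds:
            # Model a column that stores whole seconds only.
            ts = ts.replace(microsecond=0)
        keys.append((dag_id, ts))
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample: two triggers of the same DAG within one second.
triggers = [
    ("my-job", datetime(2016, 12, 13, 19, 52, 33, 210790)),
    ("my-job", datetime(2016, 12, 13, 19, 52, 33, 734112)),
]

if __name__ == "__main__":
    print(colliding_keys(triggers, keep_microseconds=False))
    print(colliding_keys(triggers, keep_microseconds=True))
```

With whole seconds the two triggers share one key; with microseconds kept they do not, although two triggers in the same microsecond would still collide.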




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
