airflow-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Edgar Rodriguez (JIRA)" <j...@apache.org>
Subject [jira] [Work started] (AIRFLOW-1495) TaskInstanceModelView using unindexed column for order_by on default
Date Tue, 15 Aug 2017 16:59:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Work on AIRFLOW-1495 started by Edgar Rodriguez.
------------------------------------------------
> TaskInstanceModelView using unindexed column for order_by on default
> --------------------------------------------------------------------
>
>                 Key: AIRFLOW-1495
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1495
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: webserver
>            Reporter: Edgar Rodriguez
>            Assignee: Edgar Rodriguez
>
> Using unindexed columns for ordering query results triggers sorting in the database,
which makes for slower queries.
> Solution is to use an indexed column or add an index on the desired columns if it doesn't
exist, see: https://github.com/apache/incubator-airflow/blob/master/airflow/www/views.py#L2489
> When checking for indices on task_instance, none is found for {{job_id}}:
> {code}
> mysql> show index from task_instance;
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> | Table         | Non_unique | Key_name     | Seq_in_index | Column_name     | Collation
| Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> | task_instance |          0 | PRIMARY      |            1 | task_id         | A    
    |          71 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          0 | PRIMARY      |            2 | dag_id          | A    
    |          80 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          0 | PRIMARY      |            3 | execution_date  | A    
    |       30056 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          1 | ti_dag_state |            1 | dag_id          | A    
    |          18 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          1 | ti_dag_state |            2 | state           | A    
    |          28 |     NULL | NULL   | YES  | BTREE      |         |               |
> | task_instance |          1 | ti_pool      |            1 | pool            | A    
    |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
> | task_instance |          1 | ti_pool      |            2 | state           | A    
    |           3 |     NULL | NULL   | YES  | BTREE      |         |               |
> | task_instance |          1 | ti_pool      |            3 | priority_weight | A    
    |          16 |     NULL | NULL   | YES  | BTREE      |         |               |
> | task_instance |          1 | ti_state_lkp |            1 | dag_id          | A    
    |          18 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          1 | ti_state_lkp |            2 | task_id         | A    
    |          82 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          1 | ti_state_lkp |            3 | execution_date  | A    
    |       28037 |     NULL | NULL   |      | BTREE      |         |               |
> | task_instance |          1 | ti_state_lkp |            4 | state           | A    
    |       29354 |     NULL | NULL   | YES  | BTREE      |         |               |
> | task_instance |          1 | ti_state     |            1 | state           | A    
    |           3 |     NULL | NULL   | YES  | BTREE      |         |               |
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> {code}
> So, when executing a query similar to the one for the view paging, the query explain
shows the required filesort:
> {code}
> mysql> explain select * from task_instance order by job_id limit 500 offset 500;
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> | id | select_type | table         | partitions | type | possible_keys | key  | key_len
| ref  | rows  | filtered | Extra          |
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> |  1 | SIMPLE      | task_instance | NULL       | ALL  | NULL          | NULL | NULL
   | NULL | 30119 |   100.00 | Using filesort |
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message