airflow-commits mailing list archives

From "John Arnold (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AIRFLOW-2367) High POSTGRES DB CPU utilization
Date Mon, 23 Apr 2018 22:19:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448953#comment-16448953 ]

John Arnold edited comment on AIRFLOW-2367 at 4/23/18 10:18 PM:
----------------------------------------------------------------

[~bolke]  Any suggestions on which metrics or configuration options to look at?  We've been
looking over the database (top 10 queries, etc.) and I don't see any surprises. The top
query by far is against the task_instance table, and all of its conditionals are on indexed
columns. I went through basically every query in models.py looking for any that filter on
unindexed columns, and didn't find any.

I've attached a screenshot of the top 10 queries.
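
For anyone who wants to cross-check the "no unindexed columns" reading from the database side, here is a minimal sketch. It assumes you run the SQL below against the Airflow metadata DB yourself (psql or any driver) and pass the rows to the helper. `pg_stat_user_tables` and its columns are standard PostgreSQL; the helper name and the threshold are hypothetical, picked only for illustration.

```python
# SQL against PostgreSQL's per-table statistics view. Run it with any client
# and feed the resulting rows to flag_seq_scan_heavy() below.
TABLE_SCAN_SQL = """
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC
"""


def flag_seq_scan_heavy(rows, min_seq_tup_read=1000000):
    """Return (relname, seq_scan, seq_tup_read) for tables whose sequential
    scans have read at least min_seq_tup_read rows.

    rows: iterable of (relname, seq_scan, seq_tup_read, idx_scan) tuples.
    The default threshold is an arbitrary assumption, not a tuned value.
    """
    flagged = []
    for relname, seq_scan, seq_tup_read, _idx_scan in rows:
        # Counters can be NULL/None in some cases, so treat None as 0.
        if (seq_tup_read or 0) >= min_seq_tup_read:
            flagged.append((relname, seq_scan or 0, seq_tup_read))
    return flagged
```

A table that shows up here with a large seq_tup_read is being read row by row somewhere, even if every query in models.py looks indexed on paper.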

We experimented with our connection pool sizes, thinking that perhaps we were hammering the
db with connections, but that didn't seem to make any difference.  We have the scheduler set
with a connection pool of 20, two instances of the webserver with connection pool = 5, and
all the celery workers with connection pool = 1.
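
As a rough sanity check on those pool settings, the sketch below adds up the pooled connections they allow. It assumes the webserver value means 5 per instance, and the celery worker count is a made-up placeholder since the real number isn't stated here. It also ignores any pool overflow and any extra processes the scheduler may fork, so treat the result as a lower-bound estimate rather than a measurement.

```python
def max_pooled_connections(pools):
    """Sum pooled connections over (process_count, pool_size) pairs."""
    return sum(count * size for count, size in pools)


# Pool sizes from the comment above; process counts are partly assumed.
scheduler = (1, 20)
webservers = (2, 5)         # assumes pool = 5 for each of the two instances
celery_workers = (100, 1)   # hypothetical worker count, for illustration only

total = max_pooled_connections([scheduler, webservers, celery_workers])
print(total)
```

If the estimate lands far below what pg_stat_activity reports, the extra connections are coming from somewhere other than these pools.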


> High POSTGRES DB CPU utilization
> --------------------------------
>
>                 Key: AIRFLOW-2367
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2367
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 2.0, 1.9.0
>            Reporter: John Arnold
>            Priority: Major
>         Attachments: cpu.png, postgres.png
>
>
> We are seeing steady-state 70-90% CPU utilization.  It feels like a missing-index kind
> of problem, as our TPS rate is really low, I'm not seeing any long-running queries,
> connection counts are reasonable (low hundreds), and locks also look reasonable (not many
> exclusive / write locks).
> We shut down the webserver and the load doesn't go away, so it doesn't seem to be in that
> part of the code. My guess is that either the scheduler has an inefficient query, or the
> (Celery) executor code path does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
