airflow-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [airflow] kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
Date Fri, 06 Sep 2019 03:55:41 GMT
kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-528697431
 
 
   Some more testing:
   
   For another trial, I am completely removing the need to load JSON from a str by using JSON columns instead of str columns.
   
   I just ran some benchmarks on my local machine and the results are very impressive. Not having to load JSON from a str and vice versa seems to have halved the time needed to deserialize DAGs.
   
   For 100 Dags,
   
   Parsing from file: 19.6 s (14.6 s - Best run after 5 runs)
   Dag Serialisation with `json.loads`: 26.5 s (17.8 s - Best Run after 5 runs)
   Dag Serialisation with `ujson`: 25.8 s (17.3 s - Best Run after 5 runs)
   Dag Serialisation with *JSON Columns* (removed converting str to JSON & vice versa): 12.1 s (6.98 s ± 169 ms - Best Run after 5 runs)
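
For anyone who wants to get a feel for the str conversion cost on their own machine, here is a minimal, stdlib-only sketch. It is not the benchmark behind the numbers above: `make_fake_dag`, `roundtrip_via_str` and `best_of` are hypothetical helpers, and the dict layout is an invented stand-in, not the serialized DAG schema from this PR. It only times the `json.dumps`/`json.loads` round trip that a str column requires, and reports the best of 5 runs.

```python
import json
import timeit


def make_fake_dag(dag_index, num_tasks=50):
    # Hypothetical stand-in for a serialized DAG; the real schema is not shown here.
    return {
        "dag_id": f"example_dag_{dag_index}",
        "tasks": [
            {"task_id": f"task_{i}", "retries": 1, "downstream": [f"task_{i + 1}"]}
            for i in range(num_tasks)
        ],
    }


def roundtrip_via_str(dags):
    # What a str column requires: dict -> str on write, str -> dict on read.
    return [json.loads(json.dumps(dag)) for dag in dags]


def best_of(func, arg, runs=5):
    # Call func(arg) once per repetition and keep the fastest wall-clock time (seconds).
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=runs))


dags = [make_fake_dag(i) for i in range(100)]
best = best_of(roundtrip_via_str, dags)
print(f"str round trip for {len(dags)} fake DAGs, best of 5 runs: {best:.4f} s")
```

Absolute timings from this sketch will not match the figures above, since real DAGs are larger and the database round trip is not included; it is only meant to isolate the conversion step.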
   
   However, I need to test these results on our staging cluster too, as they can be very different. Will do it tomorrow. It has been a long day fighting with JSON libraries - ~5AM here :sleeping:

   Postgres JSONB might be even quicker! Will try that out tomorrow.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
