airflow-dev mailing list archives

From Ash Berlin-Taylor <ash_apa...@firemirror.com>
Subject Re: Plan to change type of dag_id from String to Number?
Date Thu, 09 Aug 2018 14:29:17 GMT
Since this is a big change that would touch much of the code base, before we do this we need
to see some hard numbers - timing or benchmarks of queries etc.

Also, how often do we actually do such a join?
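
A minimal sketch of the kind of benchmark that could produce such numbers. It assumes an in-memory SQLite database and two hypothetical tables (`dag`, `task_instance`) that carry both a string `dag_id` and an integer `dag_pk`; neither the schema nor the column `dag_pk` is Airflow's real one, and SQLite timings will not carry over directly to MySQL or Postgres.

```python
import sqlite3
import time


def build_db(n_dags=200, runs_per_dag=50):
    """Create an in-memory DB holding a string key and an integer key side by side."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Hypothetical tables for illustration only, not Airflow's actual schema.
    cur.execute(
        "CREATE TABLE dag (dag_pk INTEGER PRIMARY KEY, dag_id VARCHAR(250) UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE task_instance ("
        "id INTEGER PRIMARY KEY, dag_pk INTEGER, dag_id VARCHAR(250))"
    )
    dags = [(i, "example_dag_with_a_fairly_long_name_%06d" % i) for i in range(n_dags)]
    cur.executemany("INSERT INTO dag VALUES (?, ?)", dags)
    cur.executemany(
        "INSERT INTO task_instance (dag_pk, dag_id) VALUES (?, ?)",
        [(pk, name) for pk, name in dags for _ in range(runs_per_dag)],
    )
    # Index both candidate join columns so the comparison is like for like.
    cur.execute("CREATE INDEX ti_dag_pk ON task_instance (dag_pk)")
    cur.execute("CREATE INDEX ti_dag_id ON task_instance (dag_id)")
    conn.commit()
    return conn


def time_join(conn, column, repeats=5):
    """Return (row_count, best_seconds) for a join on the given column."""
    if column not in ("dag_pk", "dag_id"):
        raise ValueError("unknown join column: %r" % column)
    sql = (
        "SELECT COUNT(*) FROM task_instance ti "
        "JOIN dag d ON ti.{0} = d.{0}".format(column)
    )
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        (count,) = conn.execute(sql).fetchone()
        best = min(best, time.perf_counter() - start)
    return count, best


if __name__ == "__main__":
    conn = build_db()
    for col in ("dag_id", "dag_pk"):
        count, secs = time_join(conn, col)
        print("join on %-6s rows=%d best=%.6fs" % (col, count, secs))
```

To be meaningful, the same comparison would need to run against the actual metastore backend with realistic row counts and the queries Airflow really issues.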

-ash

> On 9 Aug 2018, at 13:04, vardanguptacse@gmail.com wrote:
> 
> Thanks Ash for your reply, I am aligned with what you're saying. 
> 
> I was not proposing to take away the human-readable dag_id. Instead, I was thinking: why can't
> we create another field, such as dag_name, which would hold this information at all front-facing
> sites, while dag_id is changed to an integer? This would help make joins faster in the metastore.
> dag_id is currently indexed, but an index on varchar(250) takes more index blocks than an index
> on an int (4 bytes), and therefore more lookup time. Also, if dag_id is not trivial to change
> to int, let it stay as it is and let's introduce another column that is actually an integer
> in type, and let joins happen on this column across all tables.

