airflow-dev mailing list archives

From Kevin Yang <yrql...@gmail.com>
Subject Re: [2.0 spring cleaning] Require unique conn_id
Date Sun, 14 Apr 2019 07:36:14 GMT
Yup, unfortunately we at Airbnb are relying on the "feature" for some load
balancing, and also for something like sensing partitions from 2 clusters at
the same time (yup, it is ugly). At the same time, we got bitten by
having duplicate connections where one had outdated info.
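
For anyone following along, here is a rough toy model of the behavior we
rely on. It is not Airflow's code: ROWS and get_connection_host are
hypothetical names, and the only thing taken from the docs quoted below is
that one of the records sharing a conn_id gets picked at random.

```python
import random

# Hypothetical stand-in for rows in the connection table: (conn_id, host).
ROWS = [
    ("hive_default", "cluster-a.example.com"),
    ("hive_default", "cluster-b.example.com"),
    ("mysql_default", "db.example.com"),
]


def get_connection_host(conn_id, rows=ROWS, rng=random):
    """Return the host of one randomly chosen row matching conn_id."""
    matches = [host for cid, host in rows if cid == conn_id]
    if not matches:
        raise KeyError(f"no connection named {conn_id!r}")
    # With duplicates, callers cannot tell which record they will get.
    return rng.choice(matches)
```

This is also how the outdated-info problem shows up: if one of the two
hive_default rows is stale, only some lookups fail.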

I think it does make sense to have a unique key on (conn_id, conn_type)
while allowing multiple records to be stored under it. I also strongly
agree with exposing more info about connections in the UI/log (I think
someone on our team is making a small change in that category), as sometimes
people want to know more about a connection but don't have access to
the connection page.
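
To make the unique-key idea concrete, here is a minimal sketch using
sqlite3 from the standard library. The table and column names
(connection, connection_endpoint, host) and the helper functions are
assumptions for illustration only, not Airflow's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one parent row per (conn_id, conn_type), with any
# number of endpoint records stored under it.
SCHEMA = """
CREATE TABLE connection (
    id INTEGER PRIMARY KEY,
    conn_id TEXT NOT NULL,
    conn_type TEXT NOT NULL,
    UNIQUE (conn_id, conn_type)
);
CREATE TABLE connection_endpoint (
    id INTEGER PRIMARY KEY,
    connection_id INTEGER NOT NULL REFERENCES connection(id),
    host TEXT NOT NULL
);
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_connection(db, conn_id, conn_type, hosts):
    """Insert one parent row plus its endpoint records.

    Raises sqlite3.IntegrityError if (conn_id, conn_type) already exists.
    """
    cur = db.execute(
        "INSERT INTO connection (conn_id, conn_type) VALUES (?, ?)",
        (conn_id, conn_type),
    )
    parent_id = cur.lastrowid
    db.executemany(
        "INSERT INTO connection_endpoint (connection_id, host) VALUES (?, ?)",
        [(parent_id, host) for host in hosts],
    )
    return parent_id


def hosts_for(db, conn_id, conn_type):
    """List the hosts stored under one (conn_id, conn_type), in insert order."""
    rows = db.execute(
        "SELECT e.host FROM connection_endpoint e "
        "JOIN connection c ON c.id = e.connection_id "
        "WHERE c.conn_id = ? AND c.conn_type = ? ORDER BY e.id",
        (conn_id, conn_type),
    )
    return [row[0] for row in rows]
```

With this layout, a second add_connection call for the same
(conn_id, conn_type) is rejected by the database, while the load-balancing
case becomes several endpoint rows under a single connection.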

Cheers,
Kevin Y

On Sat, Apr 13, 2019 at 11:59 PM airflowuser
<airflowuser@protonmail.com.invalid> wrote:

> It can get more confusing because Airflow allows creating two connections
> with the same conn_id but different conn_type
> https://issues.apache.org/jira/browse/AIRFLOW-2784
>
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Sunday, April 14, 2019 12:22 AM, Maxime Beauchemin <
> maximebeauchemin@gmail.com> wrote:
>
> > People may rely on this feature for [poor man's] load balancing though. I
> > forget what the exact use case was, but we used this at Airbnb at some point.
> >
> > Maybe the solution is to make the UI/UX/log output much clearer around
> > this. Making the CLI log clearer should be really easy to do; the web
> > server might be a little more complicated, but nothing too complicated.
> >
> > Max
> >
> > On Fri, Apr 12, 2019 at 7:51 AM James Meickle
> > jmeickle@quantopian.com.invalid wrote:
> >
> > > Airflow fetches connections by name, but doesn't enforce unique names. My
> > > team got bit by this, since it's very unexpected behavior for most types
> > > of data entry. The reason for this behavior is explained in the docs:
> > > "Many connections with the same conn_id can be defined and when that is
> > > the case, and when the hooks uses the get_connection method from BaseHook,
> > > Airflow will choose one connection randomly, allowing for some basic load
> > > balancing and fault tolerance when used in conjunction with retries."
> > > I think this is very non-intuitive UX. If we even want to support this
> > > feature within Airflow - and I don't think that is a given - it would make
> > > much more sense to require a unique (conn_id, conn_type) but allow storing
> > > multiple related records. This wouldn't be a huge data modeling change, but
> > > would require changing the web UI to appear as a form with subforms.
>
>
>
