airflow-dev mailing list archives

From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Re: Role Based Access Control for Airflow UI
Date Thu, 20 Jul 2017 16:31:17 GMT
Sounds awesome, count me in!

* check out the prototype in my fork. I went far enough to hit some
hurdles and try different workarounds. I hooked up the Airflow Bootstrap
template too so that we feel at home in this new UI
* using a single `id` field is a requirement for FAB that Airflow doesn't
respect (composite pks). Either we add the feature to support that in FAB,
or we align on the Airflow side, modify the models and add a migration
script. This upgrade would require downtime and might be annoying to the
Airflow community, but could help with db performance a bit (smaller
index)... I probably could be convinced either way but I'm leaning toward
improving FAB
* I'm a maintainer for FAB so I can help get stuff through there
* React is in limbo at the ASF for licensing reasons, so no React at least
for now
* npm/webpack/ES6, javascript only in `.js` files
* I vote for eslint + eslint-config-airbnb as a set of linting rules for JS
* Keep it out of apache (for now): this new app ships as its own pypi package
`airflow-webserver`, with a period of overlap (maintaining 2 web apps)
before ripping out `airflow/www` from the core package
* You need to get in touch with Marty Kausas, an intern at Airbnb who's
been working on a Flask blueprint for improved, more personalized views on
DAGs that we were planning on merging into the main branch eventually. Some
of Marty's ideas and code could be merged into this effort.
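
To make the composite-pk point concrete, here is a minimal sketch of what the Airflow-side option (a migration to a single `id` column) could look like. It uses Python's built-in `sqlite3` and a hypothetical, simplified `task_instance` table. The column set, the `migrate_to_surrogate_id` helper and the table-rebuild approach are assumptions for illustration only, not the actual Airflow schema or migration script.

```python
import sqlite3

# Hypothetical, simplified stand-in for a table with a composite primary key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task_instance (
    task_id TEXT NOT NULL,
    dag_id TEXT NOT NULL,
    execution_date TEXT NOT NULL,
    state TEXT,
    PRIMARY KEY (task_id, dag_id, execution_date)
);
INSERT INTO task_instance VALUES
    ('extract', 'example_dag', '2017-07-19', 'success'),
    ('load', 'example_dag', '2017-07-19', 'running');
""")


def migrate_to_surrogate_id(conn):
    """Rebuild task_instance with a single integer `id` primary key.

    The old composite key is kept as a UNIQUE constraint, so the
    uniqueness guarantee of the original table is preserved.
    """
    conn.executescript("""
    CREATE TABLE task_instance_new (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        task_id TEXT NOT NULL,
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        state TEXT,
        UNIQUE (task_id, dag_id, execution_date)
    );
    INSERT INTO task_instance_new (task_id, dag_id, execution_date, state)
        SELECT task_id, dag_id, execution_date, state
        FROM task_instance
        ORDER BY dag_id, execution_date, task_id;
    DROP TABLE task_instance;
    ALTER TABLE task_instance_new RENAME TO task_instance;
    """)


migrate_to_surrogate_id(conn)
rows = conn.execute(
    "SELECT id, task_id FROM task_instance ORDER BY id"
).fetchall()
```

After the rebuild every row can be addressed by one `id`, while a duplicate `(task_id, dag_id, execution_date)` is still rejected. A real migration would also have to handle foreign keys, indexes and the downtime mentioned above.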

These are ideas on how I would proceed personally on this but definitely
everything here is up for discussion.

Let's meet physically at either WePay or Airbnb. Folks from the community,
let us know on this thread if you want to be part of this effort, we'll be
happy to include you.

Thanks,

Max

On Wed, Jul 19, 2017 at 7:33 PM, Joy Gao <joyg@wepay.com> wrote:

> Hey everyone,
>
> I recently transferred to Data Infra team here at WePay to focus on
> Airflow-related initiatives.
>
> Given the RBAC design is mostly hashed out, I'm happy to get this feature
> off the ground for Q3, starting with converting Airflow to Fab, if there
> are no objections.
>
> Cheers,
> Joy
>
> On Thu, Jun 29, 2017 at 7:32 AM, Gurer Kiratli <
> gurer.kiratli@airbnb.com.invalid> wrote:
>
> > Hey all,
> >
> > We talked about this internally. We would like to work on this feature but
> > given the immediate priorities we are not going to be working on it in Q3.
> > Come the end of Q3 we will reevaluate. The likely scenario is that we can
> > work on it in late Q4 or Q1 2018.
> >
> > Cheers,
> >
> > Gurer
> >
> > On Tue, Jun 27, 2017 at 8:08 AM, Chris Riccomini <criccomini@apache.org>
> > wrote:
> >
> > > I think FAB sounds like the right approach. Waiting to hear back with notes
> > > on AirBNB H2 discussion to see if they want to take this up.
> > >
> > > @Gurer, any idea when this will happen?
> > >
> > > On Thu, Jun 22, 2017 at 1:00 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > >
> > > > One downside I see from FAB is that it does not do Business Role mapping
> > > > to FAB role. I would prefer to create groups in IPA/LDAP/AD and have those
> > > > map to FAB roles instead of needing to manage that in FAB.
> > > >
> > > > B.
> > > >
> > > > > On 22 Jun 2017, at 09:36, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > > >
> > > > > Hi Guys,
> > > > >
> > > > > > Thanks for putting the thinking in! It is about time that we get this
> > > > > > moving.
> > > > > >
> > > > > > The design looks pretty sound. One can argue about the different roles
> > > > > > that are required, but that will be situation dependent I guess.
> > > > > >
> > > > > > Implementation wise I would argue together with Max that FAB is a better
> > > > > > or best fit. The ER model that is being described is pretty much a copy of
> > > > > > a normal security model. So a reimplementation of that is 1) significant
> > > > > > duplication of effort and 2) bound to have bugs that have been solved in
> > > > > > the other framework. Moreover, FAB does have integration out of the box
> > > > > > with some enterprisey systems like IPA, ActiveDirectory, and LDAP.
> > > > > >
> > > > > > So while you argue that using FAB would increase the scope of the
> > > > > > proposal significantly, I think that is not true. Using FAB would allow
> > > > > > you to focus on what kind of out-of-the-box permission sets and roles we
> > > > > > would need and maybe address some issues that FAB lacks (maybe how to deal
> > > > > > with non web access - i.e. in DAGs, maybe Kerberos, probably how to deal
> > > > > > with API calls that are not CRUD). Implementation wise it probably
> > > > > > simplifies what we need to do. Maybe - using Max’s early POC as an example
> > > > > > - we can slowly move over?
> > > > > >
> > > > > > On a side note: I'm planning to hire 2-3 ppl to work on Airflow in the
> > > > > > coming year. Improvement of Security, Enterprise Integration, Revamp UI
> > > > > > are on the todo list. However, this is not confirmed yet as business
> > > > > > priorities might change.
> > > > >
> > > > > Bolke.
> > > > >
> > > > >
> > > > >> On 15 Jun 2017, at 21:45, kalpesh dharwadkar <
> > > > kalpeshdharwadkar@gmail.com> wrote:
> > > > >>
> > > > >> @Dan:
> > > > >>
> > > > > >> Thanks for your feedback. I will remove the REFRESH_DAG permission.
> > > > > >>
> > > > > >> @Max:
> > > > > >>
> > > > > >> Thanks for your response.
> > > > > >>
> > > > > >> The scope of my proposal was just to add an RBAC security feature to
> > > > > >> Airflow without replacing any existing frameworks.
> > > > > >>
> > > > > >> I understand that adopting FAB would serve Airflow better moving forward;
> > > > > >> however, porting Airflow to FAB significantly increases the scope of
> > > > > >> the proposal and I don't have the time and expertise to carry out the
> > > > > >> tasks in the extended scope.
> > > > > >>
> > > > > >> Hence, I'm curious to know if there's a plan for Airflow to migrate to
> > > > > >> FAB this year?
> > > > >>
> > > > >> - Kalpesh
> > > > >>
> > > > >> On Mon, Jun 12, 2017 at 6:16 PM, Maxime Beauchemin <
> > > > >> maximebeauchemin@gmail.com> wrote:
> > > > >>
> > > > > >>> It would be nice to go with a framework for this. I did some
> > > > > >>> experimentation using FlaskAppBuilder to go in this direction. It
> > > > > >>> provides auth on different authentication backends out of the box
> > > > > >>> (oauth, openid, ldap, registration, ...), generates perms for each view
> > > > > >>> that has an @has_access decorator, generates a set of perms for each ORM
> > > > > >>> model (show, edit, delete, add, ...) and enforces it in the CRUD views as
> > > > > >>> well as in the generated REST api that you get for free as a byproduct of
> > > > > >>> deriving FAB's models (essentially it's SqlAlchemy with a layer on top).
> > > > > >>>
> > > > > >>> I started a POC on FAB here a while ago:
> > > > > >>> https://github.com/mistercrunch/airflow_webserver
> > > > > >>> At the time my main motivation was the free/instantaneous REST api.
> > > > > >>>
> > > > > >>> I think FAB is a decent fit as the porting should be fairly
> > > > > >>> straightforward (moving the flask views over and deprecating Flask-Admin
> > > > > >>> in favor of FAB's crud) though there were a few blockers. From memory I
> > > > > >>> think FAB didn't like the compound PKs we use in some of the Airflow
> > > > > >>> models. We'd have to either write a db migration script on the Airflow
> > > > > >>> side, or add support for compound keys to FAB (I recently became a
> > > > > >>> maintainer of the project, so I could help with that).
> > > > > >>>
> > > > > >>> The only downside of FAB is that it's not as mature as something like
> > > > > >>> Django, but porting to Django would surely be much more work.
> > > > > >>>
> > > > > >>> Then there's the flask-security suite, but that looks like a bit of a
> > > > > >>> patchwork to me, I guess we can pick and choose which we want to use.
> > > > >>>
> > > > >>> Max
> > > > >>>
> > > > >>> On Mon, Jun 12, 2017 at 12:50 PM, Dan Davydov <
> > > > >>> dan.davydov@airbnb.com.invalid> wrote:
> > > > >>>
> > > > > >>>> Looks good to me in general, thanks for putting this together!
> > > > > >>>>
> > > > > >>>> I think the ability to integrate with external RBAC systems like LDAP is
> > > > > >>>> important (i.e. the Airflow DB should not be decoupled with the RBAC
> > > > > >>>> database wherever possible).
> > > > > >>>>
> > > > > >>>> I wouldn't be too worried about the permissions for refreshing DAGs; as
> > > > > >>>> far as I know this functionality is no longer required with the new
> > > > > >>>> webservers which reload state periodically, and will certainly be removed
> > > > > >>>> when we have a better DAG consistency story.
> > > > > >>>>
> > > > > >>>> I think it would also be good to think about this proposal/implementation
> > > > > >>>> and how it applies in the API-driven world (e.g. when webserver hits APIs
> > > > > >>>> like /clear on behalf of users instead of running commands against the
> > > > > >>>> database directly).
> > > > >>>>
> > > > > >>>> On Mon, Jun 12, 2017 at 11:12 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > > > >>>>
> > > > > >>>>> Will respond but I'm traveling at the moment. Give me a few days.
> > > > >>>>>
> > > > >>>>> Sent from my iPhone
> > > > >>>>>
> > > > > >>>>>> On 12 Jun 2017, at 13:39, Chris Riccomini <criccomini@apache.org> wrote:
> > > > > >>>>>>
> > > > > >>>>>> Hey all,
> > > > > >>>>>>
> > > > > >>>>>> Checking in on this. We spent a good chunk of time thinking about this,
> > > > > >>>>>> and want to move forward with it, but want to make sure we're all on the
> > > > > >>>>>> same page.
> > > > >>>>>>
> > > > >>>>>> Max? Bolke? Dan? Jeremiah?
> > > > >>>>>>
> > > > >>>>>> Cheers,
> > > > >>>>>> Chris
> > > > >>>>>>
> > > > > >>>>>> On Thu, Jun 8, 2017 at 1:49 PM, kalpesh dharwadkar <kalpeshdharwadkar@gmail.com> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Hello everyone,
> > > > > >>>>>>>
> > > > > >>>>>>> As you all know, currently Airflow doesn’t have a built-in Role Based
> > > > > >>>>>>> Access Control (RBAC) capability. It does provide very limited
> > > > > >>>>>>> authorization capability by providing admin, data_profiler, and user
> > > > > >>>>>>> roles. However, associating these roles to authenticated identities is
> > > > > >>>>>>> not a simple effort.
> > > > > >>>>>>>
> > > > > >>>>>>> To address this issue, I have created a design proposal for building
> > > > > >>>>>>> RBAC into Airflow and simplifying user access management via the Airflow
> > > > > >>>>>>> UI.
> > > > > >>>>>>>
> > > > > >>>>>>> The design proposal is located at
> > > > > >>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+RBAC+proposal
> > > > > >>>>>>>
> > > > > >>>>>>> Any comments/questions/feedback are much appreciated.
> > > > >>>>>>>
> > > > >>>>>>> Thanks
> > > > >>>>>>> Kalpesh
> > > > >>>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >
> > > >
> > > >
> > >
> >
>
>
>
> --
>
> Joy Gao
> Software Engineer
> 350 Convention Way, Suite 200
> Redwood City, CA 94063
> Mobile:  669-224-9305
>
> Payments partner to the platform economy
>
