incubator-general mailing list archives

From Jakob Homan <jgho...@gmail.com>
Subject Re: [VOTE] Accept Airflow into the Incubator
Date Fri, 25 Mar 2016 05:05:56 GMT
+1 (binding)

-jakob 

> On Mar 24, 2016, at 8:20 PM, Chris Riccomini <criccomini@apache.org> wrote:
> 
> +1 non-binding :)
> 
> On Thu, Mar 24, 2016 at 8:00 PM, Siddharth Anand <sanand@agari.com.invalid>
> wrote:
> 
>> Following the discussion earlier:
>>    https://s.apache.org/AirflowDiscussion
>> 
>> I would like to call a VOTE for accepting Airflow as a new incubator
>> project.
>> 
>> The proposal is available at:
>> https://wiki.apache.org/incubator/AirflowProposal
>> 
>> The proposal is also included at the bottom of this email.
>> 
>> Vote is open until at least Tues, 29 March 2016, 23:59:00 PDT
>> [ ] +1 accept Airflow into the Apache Incubator
>> [ ] ±0
>> [ ] -1 because...
>> 
>> +1 (non-binding)
>> 
>> Thanks,
>> -s (Sid)
>> 
>> 
>> == Abstract ==
>> 
>> Airflow is a workflow automation and scheduling system that can be
>> used to author and manage data pipelines.
>> 
>> == Proposal ==
>> 
>> Airflow provides a system for authoring and managing workflows a.k.a.
>> data pipelines a.k.a. DAGs (Directed Acyclic Graphs). The developer
>> authors DAGs in Python using an Airflow-provided framework. He/She
>> then executes the DAG using Airflow’s scheduler or registers the DAG
>> for event-based execution. A web-based UI provides the developer with
>> a range of options for managing and viewing his/her data pipelines.
>> 
>> == Background ==
>> 
>> Airflow was developed at Airbnb to enable easier authorship and
>> management of DAGs than was possible with existing solutions such as
>> Oozie and Azkaban. For starters, both Oozie and Azkaban rely on one or
>> more XML or property files to be bundled together to define a
>> workflow. This separation of code and config can present a challenge
>> to understanding the DAG - in Azkaban, a DAG’s structure is reflected
>> by its file system tree and one can find himself/herself traversing
>> the file system when inspecting or changing the structure of the DAG.
>> Airflow workflows, on the other hand, are simply and elegantly defined
>> in Python code, often a single file. Airflow merges the powerful
>> Web-based management aspects of projects like Azkaban and Oozie with
>> the simplicity and elegance of defining workflows in Python. Airflow,
>> less than a year old in terms of its Open Source launch, is currently
>> used in production environments in more than 30 companies and boasts
>> an active contributor list of more than 100 developers, the vast
>> majority of whom (>95%) are outside of Airbnb.
>> 
>> We would like to share it with the ASF and begin developing a
>> community of developers and users within Apache.
>> 
>> == Rationale ==
>> 
>> Many organizations (>30) already benefit from running Airflow to
>> manage data pipelines. Our 100+ contributors continue to provide
>> integrations with 3rd party systems through the implementation of new
>> hooks and operators, both of which are used in defining the tasks that
>> compose workflows.
>> 
>> == Current Status ==
>> 
>> === Meritocracy ===
>> 
>> Our intent with this incubator proposal is to start building a diverse
>> developer community around Airflow following the Apache meritocracy
>> model. Since Airflow was open-sourced in mid-2015, we have had fast
>> adoption and contributions by multiple organizations the world over.
>> We plan to continue to support new contributors and we will work to
>> actively promote those who contribute significantly to the project to
>> committers.
>> 
>> === Community ===
>> 
>> Airflow is currently being used in over 30 companies. We hope to
>> extend our contributor base significantly and invite all those who are
>> interested in building large-scale distributed systems to participate.
>> 
>> === Core Developers ===
>> 
>> Airflow is currently being developed by four engineers: Maxime
>> Beauchemin, Siddharth Anand, Bolke de Bruin, and Chris Riccomini.
>> Chris is a member of the Apache Samza PMC and a contributor to various
>> Apache projects, including Apache Kafka and Apache YARN. Maxime,
>> Siddharth, and Bolke have contributed to Airflow.
>> 
>> === Alignment ===
>> 
>> The ASF is the natural choice to host the Airflow project as its goal
>> of encouraging community-driven open-source projects fits with our
>> vision for Airflow.
>> 
>> == Known Risks ==
>> 
>> === Orphaned Products ===
>> 
>> The core developers plan to work part time on the project. There is
>> very little risk of Airflow being abandoned as all of our companies
>> rely on it.
>> 
>> === Inexperience with Open Source ===
>> 
>> All of the core developers have experience with open source
>> development. Chris is a member of the Apache Samza PMC and a
>> contributor to various Apache projects, including Apache Kafka and
>> Apache YARN. Bolke is a contributor to multiple open source projects and
>> a few Apache projects as well, including Apache Hive, Apache Hadoop,
>> and Apache Ranger.
>> 
>> === Homogeneous Developers ===
>> 
>> The current core developers are all from different companies. Our
>> community of 100 contributors hails from over 30 different companies
>> from across the world.
>> 
>> === Reliance on Salaried Developers ===
>> 
>> Currently, the only developer paid to work on this project is Maxime.
>> 
>> === Relationships with Other Apache Products ===
>> 
>> Airflow is deeply integrated with Apache products. It currently
>> provides hooks and operators to enable workflows to leverage Apache
>> Pig, Apache Hive, Apache Spark, Apache Sqoop, Apache Hadoop, etc. We
>> plan to add support for other Apache projects in the future.
>> 
>> === An Excessive Fascination with the Apache Brand ===
>> 
>> While we respect the reputation of the Apache brand and have no doubts
>> that it will attract contributors and users, our interest is primarily
>> to give Airflow a solid home as an open source project following an
>> established development model. We have also given reasons in the
>> Rationale and Alignment sections.
>> 
>> == Documentation ==
>> http://wiki.apache.org/incubator/AirflowProposal
>> 
>> == Initial Source ==
>> https://github.com/airbnb/airflow
>> 
>> == Source and Intellectual Property Submission Plan ==
>> 
>> As soon as Airflow is approved to join Apache Incubator, Airbnb will
>> execute a Software Grant Agreement and the source code will be
>> transitioned onto ASF infrastructure. The code is already licensed
>> under the Apache Software License, version 2.0. We know of no legal
>> encumbrances that would inhibit the transfer of source code to the
>> ASF.
>> 
>> == External Dependencies ==
>> 
>> The dependencies all have Apache compatible licenses.
>> 
>> * [[
>> https://bitbucket.org/zzzeek/alembic/src/9538c3e1a71c946a53f8762e68e94cfbcb9f932f/LICENSE?fileviewer=file-view-default|alembic
>> (MIT)]]
>> * [[https://github.com/boto/boto/blob/develop/LICENSE|boto (MIT)]]
>> * [[https://github.com/celery/celery/blob/master/LICENSE|celery (BSD)]]
>> * [[https://github.com/mher/chartkick.py/blob/master/LICENSE|chartkick
>> (MIT)]]
>> * [[
>> https://github.com/pyca/cryptography/blob/master/LICENSE.APACHE|cryptography
>> (Apache 2.0/BSD)]]
>> * [[
>> https://bitbucket.org/ned/coveragepy/src/b74c40b2c107db17f0775be5ec6c44f5e1cf5cbf/LICENSE.txt?fileviewer=file-view-default|coverage
>> (Apache 2.0)]]
>> * [[
>> https://github.com/coagulant/coveralls-python/blob/master/LICENCE|coveralls
>> (MIT)]]
>> * [[https://pypi.python.org/pypi/croniter|croniter (MIT)]]
>> * [[https://github.com/uqfoundation/dill/blob/master/LICENSE|dill (BSD)]]
>> * [[https://github.com/docker/docker-py/blob/master/LICENSE|docker-py
>> (Apache 2.0)]]
>> * [[
>> https://bitbucket.org/fabian/filechunkio/src/84289d7599a207f575cb28db719dd9d44e880208/LICENCE?fileviewer=file-view-default|filechunkio
>> (MIT)]]
>> * [[
>> https://bitbucket.org/tarek/flake8/src/a209fb69350c572c9b2d7b4b09c7657be153be5e/LICENSE?fileviewer=file-view-default|flake8
>> (MIT)]]
>> * [[https://github.com/mitsuhiko/flask/blob/master/LICENSE|flask (BSD)]]
>> * [[
>> https://github.com/flask-admin/flask-admin/blob/master/LICENSE|flask-admin
>> (BSD)]]
>> * [[
>> https://github.com/thadeusb/flask-cache/blob/master/LICENSE|flask-cache
>> (BSD)]]
>> * [[
>> https://github.com/maxcountryman/flask-login/blob/master/LICENSE|flask-login
>> (MIT)]]
>> * [[https://github.com/mher/flower/blob/master/LICENSE|flower (BSD)]]
>> * [[
>> https://github.com/PythonCharmers/python-future/blob/master/LICENSE.txt|future
>> (MIT)]]
>> * [[https://github.com/benoitc/gunicorn/blob/master/LICENSE|gunicorn
>> (MIT)]]
>> * [[
>> https://github.com/youngwookim/hive-thrift-py/blob/master/setup.py|hive-thrift-py
>> (Apache 2.0)]]
>> * [[https://github.com/ipython/ipython/blob/master/COPYING.rst|ipython
>> (BSD)]]
>> * [[https://github.com/mitsuhiko/jinja2/blob/master/LICENSE|jinja2
>> (BSD)]]
>> * [[
>> https://github.com/waylan/Python-Markdown/blob/master/LICENSE.md|markdown
>> (BSD)]]
>> * [[https://github.com/pydata/pandas/blob/master/LICENSE|pandas (BSD)]]
>> * [[https://pypi.python.org/pypi/Pygments|pygments (BSD)]]
>> * pyhive
>> * pydruid
>> * PyOpenSSL
>> * PySmbClient
>> * python-dateutil
>> * redis
>> * requests
>> * setproctitle
>> * statsd
>> * sphinx
>> * sphinx-argparse
>> * sphinx_rtd_theme
>> * Sphinx-PyPI-upload
>> * sqlalchemy (MIT)
>> * thrift
>> * jaydebeapi
>> * mysqlclient
>> * unicodecsv
>> * slackclient
>> * ldap3
>> * Flask-WTF
>> * lxml
>> * [[https://github.com/bgamble/pykerberos/blob/master/LICENSE|pykerberos
>> (Apache 2.0)]]
>> * [[https://github.com/pyca/bcrypt/blob/master/LICENSE|bcrypt (Apache
>> 2.0)]]
>> * [[
>> https://github.com/maxcountryman/flask-bcrypt/blob/master/LICENSE|flask-bcrypt
>> (BSD)]]
>> * [[https://github.com/testing-cabal/mock/blob/master/LICENSE.txt|mock
>> (BSD)]]
>> * [[https://github.com/mtth/hdfs/blob/master/LICENSE|hdfs (MIT)]]
>> 
>> == Cryptography ==
>> 
>> None
>> 
>> == Required Resources ==
>> 
>> === Mailing Lists ===
>> 
>> * private@airflow.incubator.apache.org (moderated)
>> * dev@airflow.incubator.apache.org
>> * commits@airflow.incubator.apache.org
>> 
>> === Subversion Directory ===
>> 
>> Git is the preferred source control system: git://git.apache.org/Airflow
>> 
>> === Issue Tracking ===
>> 
>> JIRA Airflow (Airflow)
>> 
>> === Other Resources ===
>> 
>> The existing code already has unit tests, so we would like a Travis
>> instance to run them whenever a new patch is submitted. This can be
>> added after project creation.
>> 
>> == Initial Committers ==
>> 
>> * Maxime Beauchemin
>> * Siddharth Anand
>> * Chris Riccomini
>> * Bolke de Bruin
>> * Arthur Wiedmer
>> * Dan Davydov
>> * Jeremiah Lowin
>> * Patrick Leo Tardif
>> 
>> == Affiliations ==
>> 
>> * Maxime Beauchemin (Airbnb)
>> * Siddharth Anand (Agari)
>> * Chris Riccomini (WePay)
>> * Bolke de Bruin (ING)
>> * Arthur Wiedmer (Airbnb)
>> * Dan Davydov (Airbnb)
>> * Jeremiah Lowin (Kokino)
>> * Patrick Leo Tardif (Airbnb)
>> 
>> == Sponsors ==
>> 
>> === Champion ===
>> 
>> Chris Riccomini (WePay, Apache Samza PMC)
>> 
>> === Nominated Mentors ===
>> 
>> * Chris Nauroth (HortonWorks, Apache Hadoop Committer/PMC Member,
>> Apache ZooKeeper Committer, Apache Software Foundation Member)
>> * Hitesh Shah (HortonWorks, Apache Hadoop Committer/PMC Member,
>> Apache Ambari Committer/PMC Member, Apache Tez Committer/PMC Member,
>> Apache Software Foundation Member)
>> * Jakob Homan (OfferUp, Apache Hadoop Committer/PMC Member, Apache
>> Kafka Committer/PMC Member, Apache Samza Committer/PMC Member, Apache
>> Giraph Committer/PMC Member, Apache Software Foundation Member)
>> 
>> === Sponsoring Entity ===
>> 
>> We are requesting the Incubator to sponsor this project.
>> 


