airflow-dev mailing list archives

From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Re: Why do we need SQLite in Airflow?
Date Wed, 04 May 2016 16:35:33 GMT
Making it frictionless for people to get their feet wet is extremely
important. It's been a requirement since the early prototypes and I feel
strongly about keeping it that way. It's hard to test this hypothesis, but
it could be a defining factor in the success of this project (to date and
future).

Docker may allow for more batteries to be included and offer even less
friction than the `pip install` path for folks who are familiar with it.
I'd have to look to see if the community-contributed Docker images are up
to date. We may want to make that "the way to go" and change the tutorial /
quick start instructions to reflect that if it makes sense. That may
require integrating the burning of images as part of the build and/or
release process.

Max

On Wed, May 4, 2016 at 6:33 AM, Jeremiah Lowin <jlowin@apache.org> wrote:

> +1, shipping Airflow "batteries included" is very important in my opinion.
> There is a lot to grok and the easiest way to learn is by letting folks
> spin up a working installation right away. Unfortunately I don't think
> there's a viable alternative to SQLite that is also supported by
> SQLAlchemy.
>
> On Wed, May 4, 2016 at 2:57 AM Prateek Rungta <prungta2@gmail.com> wrote:
>
> > It's documented pretty well that it's only for people to get their feet wet
> > with. From the quickstart
> > <http://pythonhosted.org/airflow/start.html?highlight=sqlite>:
> >
> > Out of the box, Airflow uses a sqlite database, which you should outgrow
> > fairly quickly since no parallelization is possible using this database
> > backend. It works in conjunction with the SequentialExecutor which will
> > only run task instances sequentially. While this is very limiting, it
> > allows you to get up and running quickly and take a tour of the UI and the
> > command line utilities.
> >
> > FWIW, I'm now on day 2 of using Airflow. And while I wouldn't dream of
> > deploying Airflow using SQLite beyond my laptop, I quite appreciated being
> > able to mess with Airflow without any of the infrastructural constraints.
> >
> >
> >
> > On Tue, May 3, 2016 at 11:18 PM, Siddharth Anand <sanand@apache.org>
> > wrote:
> >
> > > From time to time, we run into bugs with the SQLite dialect in SQLAlchemy
> > > and close the bugs as "wont-fix" because we don't want to be in the
> > > business of fixing such bugs. We deem SQLite a "non-serious" database
> > > that no one [in his/her right mind] would run in his/her staging, qa, or
> > > production environments. However, we rely on the SequentialExecutor and
> > > on the SQLite DB for our tests.
> > > What should we do with SQLite? Should we lift up the hood and fix it for
> > > our needs or find either a different ORM or a different option for DB
> > > backend?
> > > Examples of bugs we encounter and close as won't fix:
> > > 1. Deleting a task instance: https://github.com/airbnb/airflow/issues/955
> > > 2. Weird pickle issue: https://issues.apache.org/jira/browse/AIRFLOW-46
> > >
> > >
> > >
> > >
> >
>
