airflow-dev mailing list archives

From Chris Riccomini <criccom...@apache.org>
Subject Re: Why do we need SQLite in Airflow?
Date Wed, 04 May 2016 20:28:57 GMT
> As far as ease of use, while docker is definitely getting more popular, it
> is hard to beat the current pip install flow for people not quite up to date
> on how to setup docker. It seems like one more hurdle if you just want to
> get started.

Strongly agree. We tried to use Vagrant and then Docker with a prior
project, and it was a pain. Another project that I'm working with now uses
Docker for its hello-world stuff, and it's really troublesome. You will get
WAY more questions if you go this route than the current simple pip/sqlite
route.

On Wed, May 4, 2016 at 12:27 PM, Maxime Beauchemin <
maximebeauchemin@gmail.com> wrote:

> Yeah I'd be curious to see how the Docker setup instructions (from scratch)
> would compare to the current ones.
>
> On Wed, May 4, 2016 at 11:05 AM, Arthur Wiedmer <arthur.wiedmer@gmail.com>
> wrote:
>
> > +1, but it feels like just piling on.
> >
> > One thing we could consider is which part we would like to fix.
> >
> > - If it is the seriousness/production ready db, but that is still a local
> > db/client, we could try something like firebird.
> > Relatively small footprint and can do multithreading, it is supported by
> > SQLAlchemy, though it is not as easy to install as sqlite on most *nixes.
> > We could spend some cycles baking this into containers as well.
> >
> > - As far as ease of use, while docker is definitely getting more popular,
> > it is hard to beat the current pip install flow for people not quite up to
> > date on how to setup docker. It seems like one more hurdle if you just want
> > to get started.
> >
> > Best,
> > Arthur
> >
> >
> > On Wed, May 4, 2016 at 9:35 AM, Maxime Beauchemin <
> > maximebeauchemin@gmail.com> wrote:
> >
> > > Making it frictionless for people to get their feet wet is extremely
> > > important. It's been a requirement since the early prototypes and I feel
> > > strongly about keeping it that way. It's hard to test this hypothesis, but
> > > it could be a defining factor in the success of this project (to-date and
> > > future).
> > >
> > > Docker may allow for more batteries to be included and offer even less
> > > friction than the `pip install` path for folks who are familiar with it.
> > > I'd have to look to see if the community contributed Docker images are up
> > > to date. We may want to make that "the way to go" and change the tutorial /
> > > quick start instructions to reflect that if it makes sense. That may
> > > require integrating the burning of images as part of the build and/or
> > > release process.
> > >
> > > Max
> > >
> > > On Wed, May 4, 2016 at 6:33 AM, Jeremiah Lowin <jlowin@apache.org>
> > > wrote:
> > >
> > > > +1, shipping Airflow "batteries included" is very important in my
> > > > opinion. There is a lot to grok and the easiest way to learn is by
> > > > letting folks spin up a working installation right away. Unfortunately
> > > > I don't think there's a viable alternative to SQLite that is also
> > > > supported by SQLAlchemy.
> > > >
> > > > On Wed, May 4, 2016 at 2:57 AM Prateek Rungta <prungta2@gmail.com>
> > > > wrote:
> > > >
> > > > > It's documented pretty well that it's only for people to get their
> > > > > feet wet with. From the quickstart
> > > > > <http://pythonhosted.org/airflow/start.html?highlight=sqlite>:
> > > > >
> > > > > Out of the box, Airflow uses a sqlite database, which you should
> > > > > outgrow fairly quickly since no parallelization is possible using this
> > > > > database backend. It works in conjunction with the SequentialExecutor
> > > > > which will only run task instances sequentially. While this is very
> > > > > limiting, it allows you to get up and running quickly and take a tour
> > > > > of the UI and the command line utilities.
> > > > >
> > > > > FWIW, I'm now on day 2 of using Airflow. And while I wouldn't dream of
> > > > > deploying Airflow using SQLite beyond my laptop, I quite appreciated
> > > > > being able to mess with Airflow without any of the infrastructural
> > > > > constraints.
> > > > >
> > > > > On Tue, May 3, 2016 at 11:18 PM, Siddharth Anand <sanand@apache.org>
> > > > > wrote:
> > > > >
> > > > > > From time to time, we run into bugs with the SQLite dialect in
> > > > > > SQLAlchemy and close the bugs as "wont-fix" because we don't want
> > > > > > to be in the business of fixing such bugs. We deem SQLite a
> > > > > > "non-serious" database that no one [in his/her right mind] would
> > > > > > run in his/her staging, qa, or production environments. However,
> > > > > > we rely on the SequentialExecutor and on the SQLite DB for our
> > > > > > tests.
> > > > > > What should we do with SQLite? Should we lift up the hood and fix
> > > > > > it for our needs or find either a different ORM or a different
> > > > > > option for DB backend?
> > > > > > Examples of bugs we encounter and close as won't fix:
> > > > > > 1. Deleting a task instance:
> > > > > >    https://github.com/airbnb/airflow/issues/955
> > > > > > 2. Weird pickle issue:
> > > > > >    https://issues.apache.org/jira/browse/AIRFLOW-46
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
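
For anyone following the thread who wants to see the limitation the quoted
quickstart text describes ("no parallelization is possible using this database
backend"), here is a minimal sketch using only Python's standard `sqlite3`
module. It is not Airflow code: the function name and the `task_instance`
table are made up for illustration and do not reflect Airflow's actual schema.
It only shows that SQLite allows one writer at a time, which is a plausible
reason the default setup pairs it with the SequentialExecutor.

```python
import sqlite3


def second_writer_is_blocked(db_path):
    """Return True if a second connection cannot start a write
    transaction while the first connection holds the write lock."""
    # isolation_level=None: autocommit mode, so BEGIN/ROLLBACK are explicit.
    # timeout=0: fail immediately instead of waiting for the lock.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        # Hypothetical table, for illustration only.
        first.execute(
            "CREATE TABLE IF NOT EXISTS task_instance "
            "(id INTEGER PRIMARY KEY, state TEXT)"
        )
        first.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
        first.execute("INSERT INTO task_instance (state) VALUES ('running')")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer tries as well
        except sqlite3.OperationalError:
            return True  # SQLite reported that the database is locked
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()
```

Calling `second_writer_is_blocked("/tmp/example.db")` (any writable path works)
should report that the second writer is refused while the first transaction is
open. A server database such as MySQL or PostgreSQL handles concurrent writers,
which is why the quickstart suggests moving off SQLite once you need parallel
task execution.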
