airflow-dev mailing list archives

From Prateek Rungta <prung...@gmail.com>
Subject Re: Why do we need SQLite in Airflow?
Date Wed, 04 May 2016 06:57:07 GMT
It's documented pretty well that it's only for people to get their feet wet
with. From the quickstart
<http://pythonhosted.org/airflow/start.html?highlight=sqlite>:

Out of the box, Airflow uses a sqlite database, which you should outgrow
fairly quickly since no parallelization is possible using this database
backend. It works in conjunction with the SequentialExecutor which will
only run task instances sequentially. While this is very limiting, it
allows you to get up and running quickly and take a tour of the UI and the
command line utilities.

FWIW, I'm now on day 2 of using Airflow. And while I wouldn't dream of
deploying Airflow using SQLite beyond my laptop, I quite appreciated being
able to mess with Airflow without any of the infrastructural constraints.



On Tue, May 3, 2016 at 11:18 PM, Siddharth Anand <sanand@apache.org> wrote:

> From time to time, we run into bugs with the SQLite dialect in SQLAlchemy
> and close them as "won't fix" because we don't want to be in the
> business of fixing such bugs. We deem SQLite a "non-serious" database
> that no one [in his/her right mind] would run in his/her staging, qa, or
> production environments. However, we rely on the SequentialExecutor and on
> the SQLite DB for our tests.
> What should we do with SQLite? Should we lift up the hood and fix it for
> our needs or find either a different ORM or a different option for DB
> backend?
> Examples of bugs we encounter and close as won't fix:
> 1. Deleting a task instance: https://github.com/airbnb/airflow/issues/955
> 2. Weird pickle issue: https://issues.apache.org/jira/browse/AIRFLOW-46
>
>
>
>
