airflow-dev mailing list archives

From Alex Van Boxel <a...@vanboxel.be>
Subject Re: Airflow 1.8.0 Release Candidate 1
Date Tue, 07 Feb 2017 16:23:07 GMT
OK, a bit of history... I moved from beta 4 to RC1 and thought I wouldn't
have problems, because it ran fine locally in testing. But my production
setup is on K8S + Redis (Celery). *What I saw in production were import
errors on the DAGs.* At first I thought they were due to the fix for showing
DAG errors in the UI, or to my own build. But they weren't. I had to go
through every commit between beta 4 and RC1 to find the error (build Docker
image, deploy in k8s).

So to clarify (I'm sure this is the commit now):

   1. I did a before and after commit test.
   2. I now installed RC1, but with that single line reverted (works after,
   broken before)

I can only reproduce it in production, where master had been running
quite OK up to beta 4. I don't really know why it does that (a bit
the downside of dynamic typing here...) and I actually don't want to dig
further. I've lost days of valuable time (although I learned a lot about
Python dynamic loading).

*Either we find the problem with the help of Feng (the implementer), or we
revert that single commit.* Or is somebody else running Celery without
having this problem?
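
To illustrate how changing a single default from None to -1 can alter
behaviour, here is a minimal sketch. It is not the actual Airflow code path
(the real cause is not identified in this thread). The `should_stop` consumer
is hypothetical and simply assumes that some caller treats None as "no limit":

```python
import argparse


def build_parser(default_num_runs):
    # Mirrors the shape of the cli.py argument quoted below; only the default varies.
    parser = argparse.ArgumentParser(prog="scheduler-sketch")
    parser.add_argument(
        "-n", "--num_runs",
        default=default_num_runs, type=int,
        help="Set the number of runs to execute before exiting")
    return parser


def should_stop(num_runs, runs_done):
    # Hypothetical consumer: None means "run forever", any int is a hard limit.
    if num_runs is None:
        return False
    return runs_done >= num_runs


for default in (None, -1):
    args = build_parser(default).parse_args([])
    print(default, should_stop(args.num_runs, runs_done=0))
# None False
# -1 True
```

With None the hypothetical loop keeps running; with -1 the same check says
"stop" before the first run, because 0 >= -1. Any code that still tests for
None would need to be updated together with the default.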






On Tue, Feb 7, 2017 at 3:57 PM Bolke de Bruin <bdbruin@gmail.com> wrote:

> Hey Alex,
>
> Thanks for tracking it down. Can you elaborate on what went wrong with
> celery? The lines below do not particularly relate to Celery directly, so I
> wonder why we are not seeing it with LocalExecutor?
>
> Cheers
> Bolke
>
> > On 7 Feb 2017, at 15:51, Alex Van Boxel <alex@vanboxel.be> wrote:
> >
> > I have to give the RC1 a *-1*. I spent hours, or rather days, getting the
> > RC running with Celery on our test environment, till I finally found the
> > commit that killed it:
> >
> > e7f6212cae82c3a3a0bc17bbcbc70646f67d02eb
> > [AIRFLOW-813] Fix unterminated unit tests in SchedulerJobTest
> > Closes #2032 from fenglu-g/master
> >
> > I was always looking at the wrong thing, because the commit only changes a
> > single default parameter from *None to -1*
> >
> > I do have the impression I'm the only one running with Celery. Are other
> > people running with it?
> >
> > *I propose reverting the commit*. Feng, can you elaborate on this change?
> >
> > Changing the default back to *None* in cli.py finally got it working:
> >
> > 'num_runs': Arg(
> >    ("-n", "--num_runs"),
> >    default=None, type=int,
> >    help="Set the number of runs to execute before exiting"),
> >
> > Thanks.
> >
> > On Tue, Feb 7, 2017 at 3:49 AM siddharth anand <sanand@apache.org> wrote:
> >
> > I did get 1.8.0 installed and running at Agari.
> >
> > I did run into 2 problems.
> > 1. Most of our DAGs broke due to the way Operators are now imported.
> >
> https://github.com/apache/incubator-airflow/blob/master/UPDATING.md#deprecated-features
> >
> > According to the documentation, these deprecations would only cause an
> > issue in 2.0. However, I needed to fix them now.
> >
> > So, I needed to change "from airflow.operators import PythonOperator" to
> > "from airflow.operators.python_operator import PythonOperator". Am I
> > missing something?
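
For anyone with many DAG files to fix, the import change described above can
be applied mechanically. This is only a sketch: `rewrite_import` and the
`NEW_MODULES` mapping are hypothetical helpers, not part of Airflow, and only
the PythonOperator entry is taken from this message.

```python
import re

# Old-style operator name -> module it must now be imported from.
# Only the PythonOperator entry comes from the message above; extend as needed.
NEW_MODULES = {"PythonOperator": "airflow.operators.python_operator"}

# Matches a single-name import such as "from airflow.operators import X".
OLD_IMPORT = re.compile(r"^from airflow\.operators import (\w+)\s*$")


def rewrite_import(line):
    """Return the new-style import for one line (given without its newline)."""
    match = OLD_IMPORT.match(line)
    if match and match.group(1) in NEW_MODULES:
        name = match.group(1)
        return "from {} import {}".format(NEW_MODULES[name], name)
    # Anything we do not recognise is returned unchanged.
    return line


print(rewrite_import("from airflow.operators import PythonOperator"))
# from airflow.operators.python_operator import PythonOperator
```

Lines that import several names at once, or names missing from the mapping,
are deliberately left untouched so they can be reviewed by hand.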
> >
> > 2. I ran into a migration problem that seems to have cleared itself up. I
> > did notice that some dags do not have data in their "DAG Runs" column on
> > the overview page computed. I am looking into that issue presently.
> >
> https://www.dropbox.com/s/cn058mtu3vcv8sq/Screenshot%202017-02-06%2018.45.07.png?dl=0
> >
> > -s
> >
> > On Mon, Feb 6, 2017 at 4:30 PM, Dan Davydov <dan.davydov@airbnb.com.invalid>
> > wrote:
> >
> >> Bolke, attached is the patch for the cgroups fix. Let me know which
> >> branches you would like me to merge it to. If anyone has complaints about
> >> the patch let me know (but it does not touch the core of airflow, only the
> >> new cgroups task runner).
> >>
> >> On Mon, Feb 6, 2017 at 4:24 PM, siddharth anand <sanand@apache.org> wrote:
> >>
> >>> Actually, I see the error is further down..
> >>>
> >>>  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 469, in do_execute
> >>>    cursor.execute(statement, parameters)
> >>>
> >>> sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) null value in
> >>> column "dag_id" violates not-null constraint
> >>>
> >>> DETAIL:  Failing row contains (null, running, 1, f).
> >>>
> >>> [SQL: 'INSERT INTO dag_stats (state, count, dirty) VALUES (%(state)s,
> >>> %(count)s, %(dirty)s)'] [parameters: {'count': 1L, 'state': u'running',
> >>> 'dirty': False}]
> >>>
> >>> It looks like an autoincrement is missing for this table.
> >>>
> >>>
> >>> I'm running `SQLAlchemy==1.1.4` - I see our setup.py specifies any
> >>> version greater than 0.9.8
> >>>
> >>> -s
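
The failing INSERT above can be reproduced in miniature. The sketch below
uses the standard-library sqlite3 module rather than SQLAlchemy and
PostgreSQL, and the table layout is an assumption pieced together from the
column names in the error (the composite key on dag_id and state is a guess).
It only shows why omitting a NOT NULL key column with no default is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: dag_id is a non-null member of a composite primary key
# and has no default or autoincrement of its own.
conn.execute(
    """
    CREATE TABLE dag_stats (
        dag_id TEXT NOT NULL,
        state  TEXT NOT NULL,
        count  INTEGER NOT NULL DEFAULT 0,
        dirty  BOOLEAN NOT NULL DEFAULT 0,
        PRIMARY KEY (dag_id, state)
    )
    """
)


def insert_stat(row):
    """Insert one row given as a dict; report success or the constraint error."""
    columns = ", ".join(row)
    placeholders = ", ".join(":" + key for key in row)
    sql = "INSERT INTO dag_stats ({}) VALUES ({})".format(columns, placeholders)
    try:
        conn.execute(sql, row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return "IntegrityError: {}".format(exc)


# Same shape as the failing statement: no dag_id is supplied.
print(insert_stat({"state": "running", "count": 1, "dirty": False}))
# Supplying dag_id explicitly succeeds.
print(insert_stat({"dag_id": "example_dag", "state": "running", "count": 1, "dirty": False}))
```

The first call reports an IntegrityError and the second succeeds, which
suggests the fix belongs in whatever builds the INSERT without a dag_id
rather than in an autoincrement on the column.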
> >>>
> >>>
> >>>
> >>> On Mon, Feb 6, 2017 at 4:11 PM, siddharth anand <sanand@apache.org>
> >>> wrote:
> >>>
> >>>> I tried upgrading to 1.8.0rc1 from 1.7.1.3 via pip install
> >>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/airflow-1.8.0rc1+apache.incubating.tar.gz
> >>>> and then ran airflow upgradedb, which didn't quite work. At first I
> >>>> thought it had completed successfully, then I saw errors: some tables
> >>>> were indeed missing. I ran it again and encountered the following
> >>>> exception:
> >>>>
> >>>> DB: postgresql://app_cousteau@db-cousteau.ep.stage.agari.com:5432/airflow
> >>>>
> >>>> [2017-02-07 00:03:20,309] {db.py:284} INFO - Creating tables
> >>>>
> >>>> INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
> >>>>
> >>>> INFO  [alembic.runtime.migration] Will assume transactional DDL.
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 ->
> >>>> 211e584da130, add TI state index
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 211e584da130 ->
> >>>> 64de9cddf6c9, add task fails journal table
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 ->
> >>>> f2ca10b85618, add dag_stats table
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 ->
> >>>> 4addfa1236f1, Add fractional seconds to mysql tables
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 ->
> >>>> 8504051e801b, xcom dag task indices
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 8504051e801b ->
> >>>> 5e7d17757c7a, add pid field to TaskInstance
> >>>>
> >>>> INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a ->
> >>>> 127d2bf2dfa7, Add dag_id/state index on dag_run table
> >>>>
> >>>> /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/crud.py:692:
> >>>> SAWarning: Column 'dag_stats.dag_id' is marked as a member of the primary
> >>>> key for table 'dag_stats', but has no Python-side or server-side default
> >>>> generator indicated, nor does it indicate 'autoincrement=True' or
> >>>> 'nullable=True', and no explicit value is passed.  Primary key columns
> >>>> typically may not store NULL. Note that as of SQLAlchemy 1.1,
> >>>> 'autoincrement=True' must be indicated explicitly for composite (e.g.
> >>>> multicolumn) primary keys if AUTO_INCREMENT/SERIAL/IDENTITY behavior is
> >>>> expected for one of the columns in the primary key. CREATE TABLE statements
> >>>> are impacted by this change as well on most backends.
> >>>>
> >>>
> >>
> >>
> >
> > --
> >  _/
> > _/ Alex Van Boxel
>
> --
  _/
_/ Alex Van Boxel
