airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: Airflow 1.8.0 Release Candidate 1
Date Wed, 08 Feb 2017 12:33:01 GMT
Alex,

Do you have anything more to go on? I don’t mind reverting the patch; however, the code it
touches seems unrelated to what you described, and the issue wasn’t reproducible. I would really
like to see more logging, and maybe a test in a clean environment plus debugging. Preferably,
I would like to make RC 2 available today and immediately raise a vote, as the *current* changes
are really small, are confined to contrib, and have been tested by the people using them.

But I am holding off for now due to your concern.

Cheers
Bolke


> On 7 Feb 2017, at 20:56, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
> How do you start the scheduler, Alex? What are the command line parameters? What do the
> logs show when it doesn’t work?
> 
> Bolke
> 
> 
> 
>> On 7 Feb 2017, at 18:52, Alex Van Boxel <alex@vanboxel.be> wrote:
>> 
>> Hey Feng,
>> 
>> The upgrades are all automated (including the workers/web/scheduler), and I triple-checked.
>> I am now test-running RC1 with just your line reverted (and it looks OK).
>> 
>> Could you do me a favour and add a test dag where you do a local import? Example:
>> 
>> bqschema.py
>> def ranking():
>>     return [
>>         {"name": "bucket_date", "type": "timestamp", "mode": "nullable"},
>>         {"name": "rank", "type": "integer", "mode": "nullable"},
>>         {"name": "audience_preference", "type": "float", "mode": "nullable"},
>>         {"name": "audience_likelihood_share", "type": "float", "mode": "nullable"}
>>     ]
>> 
>> dag.py
>> import bqschema
>> ...
>> all in the same dag folder. We use it to define our BigQuery schemas in a separate file.
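
A minimal sketch of the local import described above, assuming the dag folder is on `sys.path` when dag files are parsed. The temporary directory merely stands in for that folder, and the schema is abbreviated from the example; neither reflects an actual deployment.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the dag folder (assumption: a real setup uses its configured folder).
dag_folder = tempfile.mkdtemp()

# bqschema.py: a helper module that lives next to the dag file.
with open(os.path.join(dag_folder, "bqschema.py"), "w") as f:
    f.write(
        "def ranking():\n"
        "    return [\n"
        "        {'name': 'bucket_date', 'type': 'timestamp', 'mode': 'nullable'},\n"
        "        {'name': 'rank', 'type': 'integer', 'mode': 'nullable'},\n"
        "    ]\n"
    )

# The sibling import only resolves if the folder is on sys.path.
sys.path.insert(0, dag_folder)
importlib.invalidate_caches()
bqschema = importlib.import_module("bqschema")  # what `import bqschema` does in dag.py

schema = bqschema.ranking()
field_names = [field["name"] for field in schema]
print(field_names)
```

If a process parses the dag file without that folder on its `sys.path`, the same `import bqschema` would raise an ImportError in that process only.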
>> 
>> 
>> On Tue, Feb 7, 2017 at 6:37 PM Feng Lu <fenglu@google.com.invalid> wrote:
>> Hi Alex-
>> 
>> Please see the attached screenshots of my local testing using CeleryExecutor (on k8s as well).
>> All looks good and the workflow completed successfully.
>> 
>> Curious: did you also update the worker image?
>> Sorry for the confusion; happy to debug more if you could share your k8s setup with me.
>> 
>> Feng
>> 
>> On Tue, Feb 7, 2017 at 8:37 AM, Feng Lu <fenglu@google.com> wrote:
>> When num_runs is not explicitly specified, the default is set to -1 to match the expectation of SchedulerJob here:
>> <Screen Shot 2017-02-07 at 8.01.26 AM.png>
>> Doing so also matches the type of num_runs ('int' in this case).
>> As a result, the scheduler will run non-stop regardless of whether dag files are present
>> (since the num_runs default is now -1: unlimited).
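
To make the difference between the two defaults concrete, here is a minimal sketch using plain `argparse`. The `-n/--num_runs` option mirrors the cli.py snippet quoted in this thread; `build_parser` and `should_continue` are hypothetical helpers written for illustration, not Airflow's actual scheduler loop.

```python
import argparse


def build_parser(default):
    """Build a parser whose --num_runs default is either None or -1."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-n", "--num_runs",
        default=default, type=int,
        help="Set the number of runs to execute before exiting")
    return parser


def should_continue(num_runs, runs_done):
    """Hypothetical loop check: a negative num_runs means 'unlimited'."""
    if num_runs < 0:
        return True
    return runs_done < num_runs


# With default=-1 the value is always an int, so the comparison is safe.
args = build_parser(-1).parse_args([])
print(args.num_runs, should_continue(args.num_runs, runs_done=1000))

# With default=None, any code comparing num_runs to a number has to
# handle None explicitly first (on Python 3, `None < 0` raises TypeError).
args_none = build_parser(None).parse_args([])
print(args_none.num_runs)

# An explicit value behaves the same under either default.
explicit = build_parser(-1).parse_args(["-n", "5"])
print(explicit.num_runs, should_continue(explicit.num_runs, runs_done=5))
```

Any code path that treats `None` and `-1` differently would behave differently after the default changed, which is the kind of interaction worth checking when a one-line default change has unexpected effects.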
>> 
>> Based on what Alex described, the import error doesn't look directly related to this change.
>> Maybe this one? https://github.com/apache/incubator-airflow/commit/67cbb966410226c1489bb730af3af45330fc51b9
>> 
>> I am still in the middle of running some quick tests using the Celery executor; I will update
>> the thread once they are done.
>> 
>> 
>> On Tue, Feb 7, 2017 at 6:56 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
>> Hey Alex,
>> 
>> Thanks for tracking it down. Can you elaborate on what went wrong with Celery? The lines
>> below do not particularly relate to Celery directly, so I wonder why we are not seeing it
>> with LocalExecutor?
>> 
>> Cheers
>> Bolke
>> 
>> > On 7 Feb 2017, at 15:51, Alex Van Boxel <alex@vanboxel.be> wrote:
>> >
>> > I have to give the RC1 a *-1*. I spent hours, or rather days, trying to get the RC
>> > running with Celery on our test environment, until I finally found the
>> > commit that killed it:
>> >
>> > e7f6212cae82c3a3a0bc17bbcbc70646f67d02eb
>> > [AIRFLOW-813] Fix unterminated unit tests in SchedulerJobTest
>> > Closes #2032 from fenglu-g/master
>> >
>> > I was always looking at the wrong thing, because the commit only changes a
>> > single default parameter from *None to -1*.
>> >
>> > I do have the impression I'm the only one running with Celery. Are other
>> > people running with it?
>> >
>> > *I propose* *reverting the commit*. Feng, can you elaborate on this change?
>> >
>> > Changing the default back to *None* in cli.py finally got it working:
>> >
>> > 'num_runs': Arg(
>> >    ("-n", "--num_runs"),
>> >    default=None, type=int,
>> >    help="Set the number of runs to execute before exiting"),
>> >
>> > Thanks.
>> >
>> > On Tue, Feb 7, 2017 at 3:49 AM siddharth anand <sanand@apache.org> wrote:
>> >
>> > I did get 1.8.0 installed and running at Agari.
>> >
>> > I did run into 2 problems.
>> > 1. Most of our DAGs broke due to the way Operators are now imported.
>> > https://github.com/apache/incubator-airflow/blob/master/UPDATING.md#deprecated-features
>> >
>> > According to the documentation, these deprecations would only cause an
>> > issue in 2.0. However, I needed to fix them now.
>> >
>> > So, I needed to change "from airflow.operators import PythonOperator" to
>> > "from airflow.operators.python_operator import PythonOperator". Am I
>> > missing something?
>> >
>> > 2. I ran into a migration problem that seems to have cleared itself up. I
>> > did notice that, for some dags, the data in the "DAG Runs" column on
>> > the overview page has not been computed. I am looking into that issue presently.
>> > https://www.dropbox.com/s/cn058mtu3vcv8sq/Screenshot%202017-02-06%2018.45.07.png?dl=0
>> >
>> > -s
>> >
>> > On Mon, Feb 6, 2017 at 4:30 PM, Dan Davydov <dan.davydov@airbnb.com.invalid> wrote:
>> >
>> >> Bolke, attached is the patch for the cgroups fix. Let me know which
>> >> branches you would like me to merge it to. If anyone has complaints about
>> >> the patch let me know (but it does not touch the core of airflow, only the
>> >> new cgroups task runner).
>> >>
>> >> On Mon, Feb 6, 2017 at 4:24 PM, siddharth anand <sanand@apache.org> wrote:
>> >>
>> >>> Actually, I see the error is further down..
>> >>>
>> >>>  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 469, in do_execute
>> >>>
>> >>>    cursor.execute(statement, parameters)
>> >>>
>> >>> sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) null value in
>> >>> column "dag_id" violates not-null constraint
>> >>>
>> >>> DETAIL:  Failing row contains (null, running, 1, f).
>> >>>
>> >>> [SQL: 'INSERT INTO dag_stats (state, count, dirty) VALUES (%(state)s,
>> >>> %(count)s, %(dirty)s)'] [parameters: {'count': 1L, 'state': u'running',
>> >>> 'dirty': False}]
>> >>>
>> >>> It looks like an autoincrement is missing for this table.
>> >>>
>> >>>
>> >>> I'm running `SQLAlchemy==1.1.4` - I see our setup.py specifies any version
>> >>> greater than 0.9.8
>> >>>
>> >>> -s
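
The failure mode in the traceback above can be reproduced in miniature with the standard-library `sqlite3` module. This is only a sketch: the table layout below (a composite primary key on `dag_id` and `state`) is an assumption inferred from the error message and the SAWarning, not the actual migration, and SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed layout: dag_id is part of a composite primary key and has no default.
conn.execute(
    """
    CREATE TABLE dag_stats (
        dag_id TEXT NOT NULL,
        state  TEXT NOT NULL,
        count  INTEGER NOT NULL DEFAULT 0,
        dirty  BOOLEAN NOT NULL DEFAULT 0,
        PRIMARY KEY (dag_id, state)
    )
    """
)

# Same shape as the failing INSERT: no value is supplied for dag_id.
try:
    conn.execute(
        "INSERT INTO dag_stats (state, count, dirty) VALUES (?, ?, ?)",
        ("running", 1, False),
    )
    insert_failed = False
except sqlite3.IntegrityError as exc:
    insert_failed = True
    print("rejected:", exc)

# Supplying dag_id explicitly satisfies the constraint
# ("example_dag" is a made-up identifier).
conn.execute(
    "INSERT INTO dag_stats (dag_id, state, count, dirty) VALUES (?, ?, ?, ?)",
    ("example_dag", "running", 1, False),
)
row_count = conn.execute("SELECT COUNT(*) FROM dag_stats").fetchone()[0]
print(insert_failed, row_count)
```

Under these assumptions, a string key column such as `dag_id` cannot be generated by the database, so the value has to be supplied by the code issuing the INSERT.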
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Feb 6, 2017 at 4:11 PM, siddharth anand <sanand@apache.org> wrote:
>> >>>
>> >>>> I tried upgrading to 1.8.0rc1 from 1.7.1.3 via pip install
>> >>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/airflow-1.8.0rc1+apache.incubating.tar.gz
>> >>>> and then running airflow upgradedb, which didn't quite work. At first, I thought it
>> >>>> completed successfully, then saw errors: some tables were indeed missing. I ran it
>> >>>> again and encountered the following exception:
>> >>>>
>> >>>> DB: postgresql://app_cousteau@db-cousteau.ep.stage.agari.com:5432/airflow
>> >>>>
>> >>>> [2017-02-07 00:03:20,309] {db.py:284} INFO - Creating tables
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Will assume transactional DDL.
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 ->
>> >>>> 211e584da130, add TI state index
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 211e584da130 ->
>> >>>> 64de9cddf6c9, add task fails journal table
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 ->
>> >>>> f2ca10b85618, add dag_stats table
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 ->
>> >>>> 4addfa1236f1, Add fractional seconds to mysql tables
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 ->
>> >>>> 8504051e801b, xcom dag task indices
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 8504051e801b ->
>> >>>> 5e7d17757c7a, add pid field to TaskInstance
>> >>>>
>> >>>> INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a ->
>> >>>> 127d2bf2dfa7, Add dag_id/state index on dag_run table
>> >>>>
>> >>>> /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/crud.py:692:
>> >>>> SAWarning: Column 'dag_stats.dag_id' is marked as a member of the primary
>> >>>> key for table 'dag_stats', but has no Python-side or server-side default
>> >>>> generator indicated, nor does it indicate 'autoincrement=True' or
>> >>>> 'nullable=True', and no explicit value is passed.  Primary key columns
>> >>>> typically may not store NULL. Note that as of SQLAlchemy 1.1,
>> >>>> 'autoincrement=True' must be indicated explicitly for composite (e.g.
>> >>>> multicolumn) primary keys if AUTO_INCREMENT/SERIAL/IDENTITY behavior is
>> >>>> expected for one of the columns in the primary key. CREATE TABLE statements
>> >>>> are impacted by this change as well on most backends.
>> >>>>
>> >>>
>> >>
>> >>
>> >
>> > --
>> >  _/
>> > _/ Alex Van Boxel
>> 
>> 
>> -- 
>>   _/
>> _/ Alex Van Boxel
> 

