airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: [VOTE] Airflow 1.9.0rc4
Date Thu, 30 Nov 2017 20:34:56 GMT
Patch here: https://github.com/apache/incubator-airflow/pull/2831

Bolke

> On 30 Nov 2017, at 20:53, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
> Ill have a patch in an hour or so
> 
> 
>> On 30 Nov 2017, at 19:28, Chris Riccomini <criccomini@apache.org> wrote:
>> 
>> K, I'm going to let this vote hang here until we get confirmation as to
>> what's going on.
>> 
>> @bolke/@ash, thanks for looking into this!
>> 
>> On Thu, Nov 30, 2017 at 10:23 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
>> 
>>> Ok, I think I can confirm the issue. I don’t see any output either with a
>>> clean install. I’ll investigate a bit further.
>>> 
>>> *sigh*
>>> 
>>> Bolke
>>> 
>>> 
>>>> On 30 Nov 2017, at 19:13, Ash Berlin-Taylor <ash_airflowlist@firemirror.
>>> com> wrote:
>>>> 
>>>> Yes, that is in my logging config.
>>>> 
>>>> I think I agree about airflow.task.raw
>>>> 
>>>> -ash
>>>> 
>>>> 
>>>>> On 30 Nov 2017, at 18:06, Joy Gao <joyg@wepay.com> wrote:
>>>>> 
>>>>> Hi Ash,
>>>>> 
>>>>> Can you verify that your logging config has root logger set to
>>> 'console'?
>>>>> Something like:
>>>>> 
>>>>>     '': {
>>>>>         'handlers': ['console'],
>>>>>         'level': LOG_LEVEL
>>>>>     },
>>>>> 
>>>>> 
>>>>> That was the issue we had because we attempted to set *all* handlers
to
>>>>> airflow.task at first.
>>>>> 
>>>>> I don't think we gain anything by calling airflow.task.raw logger, so
I
>>>>> have a pending PR that removes airflow.task.raw and just uses
>>> airflow.task.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Nov 30, 2017 at 9:24 AM, Bolke de Bruin <bdbruin@gmail.com>
>>> wrote:
>>>>> 
>>>>>> We saw this issue around RC2, so I’m surprised that it turns up
again.
>>> We
>>>>>> discussed it at lengths with Fokko and Joy. It might be that we somehow
>>>>>> missed a commit, but I need to double check a from stock install.
We
>>> are
>>>>>> running 1.9 in production and we don’t have any logging issues.
>>>>>> 
>>>>>> B.
>>>>>> 
>>>>>> Verstuurd vanaf mijn iPad
>>>>>> 
>>>>>>> Op 30 nov. 2017 om 18:17 heeft Chris Riccomini <criccomini@apache.org
>>>> 
>>>>>> het volgende geschreven:
>>>>>>> 
>>>>>>> @bolke/@fokko, thoughts?
>>>>>>> 
>>>>>>> As an aside, we had an issue seeing task logs, but were able
to fix
>>> it by
>>>>>>> modifying our logging config (and perhaps patching some stuff
that we
>>>>>>> merged into 1.9.0, don't recall). William Pursell/Joy Gao might
be
>>> able
>>>>>> to
>>>>>>> comment more.
>>>>>>> 
>>>>>>> On Thu, Nov 30, 2017 at 9:08 AM, Ash Berlin-Taylor <
>>>>>>> ash_airflowlist@firemirror.com> wrote:
>>>>>>> 
>>>>>>>> If anyone else is seeing this (and I think I'm on stock configuration
>>>>>> now)
>>>>>>>> yes I do, as I'm not seeing _any_ logs form task operators
appearing.
>>>>>>>> 
>>>>>>>> There's another problem after applying a local fix for
>>>>>>>> `handler.set_context()`, to do with try_numbers:
>>>>>>>> 
>>>>>>>> The `airflow run tests test-logging 2017-11-26T00:00:00 --local
-sd
>>>>>>>> /usr/local/airflow/dags/example/.py` command sends logs to
>>> .../14.log
>>>>>>>> 
>>>>>>>> The `airflow run tests test-logging 2017-11-26T00:00:00 --job_id
33
>>>>>> --raw
>>>>>>>> -sd /usr/local/airflow/dags/example/.py` command (that the
previous
>>>>>>>> command invokes) sends logs to .../15.log
>>>>>>>> 
>>>>>>>> However the UI doesn't know about try 15 (yes try 15. I've
been
>>> running
>>>>>>>> this a bit) so any future runs will write to this file.
>>>>>>>> 
>>>>>>>> At this point I might be a bit deep and might need to look
with fresh
>>>>>> eyes
>>>>>>>> and a clean slate tomorrow. Someone confirming if they do
or don't
>>> see
>>>>>> logs
>>>>>>>> from inside an operator would help me.
>>>>>>>> 
>>>>>>>> -ash
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 30 Nov 2017, at 17:03, Chris Riccomini <criccomini@apache.org>
>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> @Ash, do you think this issue is blocking for 1.9.0?
>>>>>>>>> 
>>>>>>>>> On Thu, Nov 30, 2017 at 8:22 AM, Ash Berlin-Taylor <
>>>>>>>>> ash_airflowlist@firemirror.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Ah, I think I've got it.
>>>>>>>>>> 
>>>>>>>>>> The problem is that airflow.task.raw doesn't have
an explicit
>>> config
>>>>>>>>>> anymore(?), so when we do:
>>>>>>>>>> 
>>>>>>>>>> log = logging.getLogger('airflow.task.raw')
>>>>>>>>>> 
>>>>>>>>>> and look at its handlers, it doesn't have any. But
>>> log.parent.handlers
>>>>>>>>>> _does_ have them.
>>>>>>>>>> 
>>>>>>>>>> A few options to fix this:
>>>>>>>>>> 
>>>>>>>>>> - I can add an explicit handler for .task.raw in
my logging config.
>>>>>>>>>> - We can always call set_context on airflow.task
instead of
>>>>>>>>>> airflow.task.raw
>>>>>>>>>> - We walk up the .parent links if .propagate is True
(possibly
>>>>>> stopping
>>>>>>>>>> once we find any handlers)
>>>>>>>>>> 
>>>>>>>>>> -ash
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 30 Nov 2017, at 16:02, Ash Berlin-Taylor
>>>>>>>> <ash_airflowlist@firemirror.
>>>>>>>>>> com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> I'm fairly sure I'm back to stock config now.
Here is the DAG I am
>>>>>>>>>> testing with:
>>>>>>>>>>> 
>>>>>>>>>>> from datetime import datetime
>>>>>>>>>>> from airflow import DAG
>>>>>>>>>>> from airflow.operators.python_operator import
PythonOperator
>>>>>>>>>>> 
>>>>>>>>>>> dag_args = {
>>>>>>>>>>> "start_date": datetime(2017, 11, 20),
>>>>>>>>>>> }
>>>>>>>>>>> dag = DAG("tests", default_args=dag_args)
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> def print_stuff(ti, **kwargs):
>>>>>>>>>>> print("Hi from", __file__)
>>>>>>>>>>> ti.log.error("Hello")
>>>>>>>>>>> 
>>>>>>>>>>> with dag:
>>>>>>>>>>> PythonOperator(
>>>>>>>>>>>  task_id="test-logging",
>>>>>>>>>>>  dag=dag,
>>>>>>>>>>>  python_callable=print_stuff,
>>>>>>>>>>>  provide_context=True,
>>>>>>>>>>> )
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> And I don't see either of the "Hi from" or "Hello"
lines in the
>>>>>>>>>> resulting log file. I have edited airflow/logging_config.py
to
>>> print
>>>>>> the
>>>>>>>>>> config before it uses it, and this is the config
I'm getting:
>>>>>>>>>>> 
>>>>>>>>>>> {'disable_existing_loggers': False,
>>>>>>>>>>> 'formatters': {'airflow.processor': {'format':
'[%(asctime)s] '
>>>>>>>>>>> 
>>>>>>>>>> '{%(filename)s:%(lineno)d} '
>>>>>>>>>>>                                          '%(levelname)s
-
>>>>>>>>>> %(message)s'},
>>>>>>>>>>>          'airflow.task': {'format': '[%(asctime)s]
'
>>>>>>>>>>>                                     '{%(filename)s:%(lineno)d}
'
>>>>>>>>>>>                                     '%(levelname)s
-
>>>>>>>>>> %(message)s'}},
>>>>>>>>>>> 'handlers': {'console': {'class': 'logging.StreamHandler',
>>>>>>>>>>>                    'formatter': 'airflow.task',
>>>>>>>>>>>                    'stream': 'ext://sys.stdout'},
>>>>>>>>>>>        'file.processor': {'base_log_folder':
>>>>>>>>>> '/usr/local/airflow/logs/scheduler',
>>>>>>>>>>>                           'class': 'airflow.utils.log.file_
>>>>>>>>>> processor_handler.FileProcessorHandler',
>>>>>>>>>>>                           'filename_template':
'{{ filename
>>>>>>>>>> }}.log',
>>>>>>>>>>>                           'formatter': 'airflow.processor'},
>>>>>>>>>>>        'file.task': {'base_log_folder':
>>>>>>>> '/usr/local/airflow/logs/',
>>>>>>>>>>>                      'class': 'airflow.utils.log.file_task_
>>>>>>>>>> handler.FileTaskHandler',
>>>>>>>>>>>                      'filename_template': '{{
ti.dag_id }}/{{ '
>>>>>>>>>>>                                           'ti.task_id
}}/{{ ts
>>>>>>>>>> }}/{{ '
>>>>>>>>>>>                                           'try_number
}}.log',
>>>>>>>>>>>                      'formatter': 'airflow.task'}},
>>>>>>>>>>> 'loggers': {'': {'handlers': ['console'], 'level':
'INFO'},
>>>>>>>>>>>       'airflow': {'handlers': ['console'],
>>>>>>>>>>>                   'level': 'INFO',
>>>>>>>>>>>                   'propagate': False},
>>>>>>>>>>>       'airflow.processor': {'handlers': ['file.processor'],
>>>>>>>>>>>                             'level': 'INFO',
>>>>>>>>>>>                             'propagate': True},
>>>>>>>>>>>       'airflow.task': {'handlers': ['file.task'],
>>>>>>>>>>>                        'level': 'INFO',
>>>>>>>>>>>                        'propagate': False},
>>>>>>>>>>>       'airflow.task_runner': {'handlers': ['file.task'],
>>>>>>>>>>>                               'level': 'INFO',
>>>>>>>>>>>                               'propagate': True}},
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> If I edit run() in in bin/cli.py to print the
log.handlers under
>>> this
>>>>>>>>>> circumstance I get an empty list [].
>>>>>>>>>>> 
>>>>>>>>>>> I guess since we don't specify 'airflow.task.raw'
in the loggers
>>> that
>>>>>>>> it
>>>>>>>>>> should go up to the next level in 'airflow.task'?
Except it doesn't
>>>>>>>> seem to
>>>>>>>>>> be for me. I'm digging further.
>>>>>>>>>>> 
>>>>>>>>>>> -ash
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 30 Nov 2017, at 15:38, Bolke de Bruin
<bdbruin@gmail.com>
>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> airflow.task.raw points to airflow.task by
default, which does
>>> have
>>>>>> a
>>>>>>>>>> handler and that is the one that is travelled. And
yes we know the
>>>>>> code
>>>>>>>> is
>>>>>>>>>> incorrect in this area, but the default does work
correctly.
>>>>>>>>>>>> 
>>>>>>>>>>>> Bolke.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 30 Nov 2017, at 16:27, Ash Berlin-Taylor
<
>>>>>>>>>> ash_airflowlist@firemirror.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Okay I think the problem (if not the
cause) is around here:
>>>>>>>>>> https://github.com/apache/incubator-airflow/blob/master/
>>>>>>>>>> airflow/bin/cli.py#L366-L376
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> log = logging.getLogger('airflow.task')
>>>>>>>>>>>>> if args.raw:
>>>>>>>>>>>>> log = logging.getLogger('airflow.task.raw')
>>>>>>>>>>>>> 
>>>>>>>>>>>>> for handler in log.handlers:
>>>>>>>>>>>>> try:
>>>>>>>>>>>>> handler.set_context(ti)
>>>>>>>>>>>>> except AttributeError:
>>>>>>>>>>>>> # Not all handlers need to have context
passed in so we ignore
>>>>>>>>>>>>> # the error when handlers do not have
set_context defined.
>>>>>>>>>>>>> pass
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The problem seems to be that airflow.task.raw
is no longer
>>>>>> configured
>>>>>>>>>> explicitly, so it's log.handlers is an empty list.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This in turn means that any log calls
end up not writing
>>> anything
>>>>>> as
>>>>>>>>>> our FileTaskHandler's self.handler property is None,
meaning it
>>>>>> doesn't
>>>>>>>>>> emit anything.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Does this ring any bells to anyone?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 30 Nov 2017, at 15:18, Ash Berlin-Taylor
<
>>>>>>>>>> ash_airflowlist@firemirror.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hmmm is anyone else seeing logs from
inside operators? Right
>>> now
>>>>>>>>>> we're not getting any logs out of operators. I'm
in the process of
>>>>>>>>>> debugging to work out if it's our logging config
or something in
>>>>>>>> Airflow,
>>>>>>>>>> but right now I don't seem to get any output from
the operator
>>> itself,
>>>>>>>>>> neither print(), or `ti.log.info()`
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Anyone else noticed similar, or is
it something specific to our
>>>>>>>>>> environment?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -ash
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 30 Nov 2017, at 09:52, Ash
Berlin-Taylor <
>>>>>>>>>> ash_airflowlist@firemirror.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> We're running more tests (day
of meetings meant we couldn't
>>> do as
>>>>>>>>>> many as I'd have liked yesterday) but looking good
so far.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> +0.5 (non binding)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -ash
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 29 Nov 2017, at 18:31,
Chris Riccomini <
>>>>>> criccomini@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Anyone else? Last day before
I close the vote off.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Mon, Nov 27, 2017 at 12:32
PM, Driesprong, Fokko
>>>>>>>>>> <fokko@driesprong.frl>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Op ma 27 nov. 2017 om
21:24 schreef Bolke de Bruin <
>>>>>>>>>> bdbruin@gmail.com>
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> +1, binding
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Let’s get it out!
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On 27 Nov 2017,
at 20:32, Chris Riccomini <
>>>>>>>> criccomini@apache.org
>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hey all,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I have cut Airflow
1.9.0 RC4. This email is calling a
>>> vote on
>>>>>>>> the
>>>>>>>>>>>>>>>>>> release,
>>>>>>>>>>>>>>>>>>> which will last
for 72 hours. Consider this my (binding)
>>> +1.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Airflow 1.9.0
RC4 is available at:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/
>>> dist/dev/incubator/airflow/1.
>>>>>>>>>> 9.0rc4/
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> apache-airflow-1.9.0rc4+incubating-source.tar.gz
is a
>>> source
>>>>>>>>>> release
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>> comes with INSTALL
instructions.
>>>>>>>>>>>>>>>>>>> apache-airflow-1.9.0rc4+incubating-bin.tar.gz
is the
>>> binary
>>>>>>>>>> Python
>>>>>>>>>>>>>>>>>> "sdist"
>>>>>>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Public keys are
available at:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/
>>>>>> airflow/
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> The release contains
the following JIRAs:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> *https://issues.apache.org/jira/browse/AIRFLOW-1839
>>>>>>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/AIRFLOW-1839>*
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Along with all
JIRAs that were in 1.9.0RC3 (see previous
>>> VOTE
>>>>>>>>>> email for
>>>>>>>>>>>>>>>>>>> full list).
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
> 


Mime
View raw message