airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: new logging
Date Fri, 17 Nov 2017 18:47:16 GMT
self.log.info("xxx")

works in every other operator, so please provide your code so that we can help you.

Bolke.
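[A minimal, self-contained sketch of the pattern Bolke describes. The `LoggingMixin` below is a simplified stand-in for Airflow's real mixin, and `MyCustomOperator` is a hypothetical class, not the actual BashOperator subclass; the buffer handler only exists to make the sketch observable.]

```python
import io
import logging

class LoggingMixin:
    """Simplified sketch (not Airflow's actual code) of the LoggingMixin
    idea: expose a logger named after the concrete class via self.log."""
    @property
    def log(self):
        cls = self.__class__
        return logging.getLogger(f"{cls.__module__}.{cls.__name__}")

class MyCustomOperator(LoggingMixin):
    """Hypothetical operator-like class standing in for a custom operator."""
    def execute(self):
        # In real Airflow, records logged here are routed to the task log
        # by handlers configured on the 'airflow' logger hierarchy.
        self.log.info("xxx")

# Capture the operator's output in a buffer to show the flow end to end.
op = MyCustomOperator()
buf = io.StringIO()
op.log.setLevel(logging.INFO)
op.log.addHandler(logging.StreamHandler(buf))
op.execute()
print(buf.getvalue().strip())  # xxx
```

Because the logger name is derived from the class, every operator gets a distinct, hierarchy-friendly logger without any per-operator setup.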

> On 17 Nov 2017, at 19:25, Kevin Lam <kevin@fathomhealth.co> wrote:
> 
> Sorry for spam! To clarify again, the expected behaviour is that the
> logging.info calls in the custom operator would be forwarded to the
> base_task_runner and then be available for viewing in the task logs in the
> Airflow Web UI.
> 
> On Fri, Nov 17, 2017 at 1:20 PM, Kevin Lam <kevin@fathomhealth.co> wrote:
> 
>> Note that in 1.8 the behaviour is as expected. Thanks again!
>> 
>> On Fri, Nov 17, 2017 at 1:17 PM, kevin@fathomhealth.co <kevin@fathomhealth.co>
>> wrote:
>> 
>>> Hi,
>>> 
>>> I'm encountering a seemingly related issue in 1.9.
>>> 
>>> I have a custom Airflow operator that I want to log from, and none of the
>>> output seems to get forwarded to the Task Logs in the Airflow Web UI,
>>> except when I use STDOUT via print().
>>> 
>>> My custom operator inherits from the BashOperator, and I've tried logging
>>> the following other ways:
>>> 
>>> - using self.log passed from BaseOperator
>>> - initializing the logger with log = logging.getLogger(__name__)
>>> - making sure to set the logging level in airflow.cfg
>>> 
>>> Do you know if there is something we should be otherwise doing?
>>> 
>>> Thanks in advance for your help!
>>> 
>>> On 2017-10-18 07:00, Boris <boriskey@gmail.com> wrote:
>>>> Got it, thanks Bolke
>>>> 
>>>> On Oct 18, 2017 2:21 AM, "Bolke de Bruin" <bdbruin@gmail.com> wrote:
>>>> 
>>>>> You should always init the logger. It is bad practice not to do it.
>>>>> 
>>>>> log = logging.getLogger(__name__)
>>>>> 
>>>>> is the canonical way. In Airflow we provide some convenience functions for
>>>>> operators etc. The LoggingMixin can help out.
>>>>> 
>>>>> Bolke
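[The canonical module-level pattern Bolke names can be sketched in plain stdlib terms; the buffer handler below is only for demonstration and is not part of the pattern itself.]

```python
import io
import logging

# Canonical pattern: one module-level logger named after the module, so
# its records slot into the dotted-name logging hierarchy.
log = logging.getLogger(__name__)
log.setLevel(logging.INFO)

# For this self-contained sketch, capture the output in a buffer instead
# of relying on externally configured handlers.
buf = io.StringIO()
log.addHandler(logging.StreamHandler(buf))

log.info("doing work")
print(buf.getvalue().strip())  # doing work
```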
>>>>> 
>>>>> Sent from my iPad
>>>>> 
>>>>>> On 17 Oct 2017 at 20:21, Boris Tyukin <boris@boristyukin.com> wrote:
>>>>>> 
>>>>>> thank you both! So my understanding is that instead of a one-liner like
>>>>>> logging.info("something"), I need to init the logger first in a Python
>>>>>> operator, and only after that can I invoke logger.info like below. This is
>>>>>> a bit unexpected and a bit more work, but it does work now - thanks. Maybe
>>>>>> I was not doing it right in the first place, but hey, it did work in 1.8 :)
>>>>>> 
>>>>>> logger = logging.getLogger('airflow.python_test_logging')
>>>>>> logger.info('hi from loggin info')
>>>>>> 
>>>>>> On Tue, Oct 17, 2017 at 6:35 AM, Driesprong, Fokko <fokko@driesprong.frl>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Boris,
>>>>>>> 
>>>>>>> This is interesting.
>>>>>>> 
>>>>>>> When I run the example you provide with the latest Airflow release, I get:
>>>>>>> root@9e3cf03c0544:~# airflow test python_test_logging print_the_context 2017-01-01
>>>>>>> [2017-10-17 10:27:38,321] {__init__.py:45} INFO - Using executor SequentialExecutor
>>>>>>> [2017-10-17 10:27:38,359] {models.py:186} INFO - Filling up the DagBag from /root/airflow/dags
>>>>>>> [2017-10-17 10:27:38,406] {dag.py:30} WARNING - test warn
>>>>>>> [2017-10-17 10:27:38,406] {dag.py:31} INFO - test info
>>>>>>> [2017-10-17 10:27:38,429] {models.py:1165} INFO - Dependencies all met for
>>>>>>> <TaskInstance: python_test_logging.print_the_context 2017-01-01 00:00:00 [None]>
>>>>>>> [2017-10-17 10:27:38,432] {models.py:1165} INFO - Dependencies all met for
>>>>>>> <TaskInstance: python_test_logging.print_the_context 2017-01-01 00:00:00 [None]>
>>>>>>> [2017-10-17 10:27:38,432] {models.py:1375} INFO -
>>>>>>> --------------------------------------------------------------------------------
>>>>>>> Starting attempt 1 of 1
>>>>>>> --------------------------------------------------------------------------------
>>>>>>> 
>>>>>>> [2017-10-17 10:27:38,433] {models.py:1396} INFO - Executing
>>>>>>> <Task(PythonOperator): print_the_context> on 2017-01-01 00:00:00
>>>>>>> {'END_DATE': '2017-01-01',
>>>>>>> 'conf': <module 'airflow.configuration' from '/usr/local/lib/python3.6/site-packages/airflow/configuration.py'>,
>>>>>>> 'dag': <DAG: python_test_logging>,
>>>>>>> 'dag_run': None,
>>>>>>> 'ds_nodash': '20170101',
>>>>>>> 'end_date': '2017-01-01',
>>>>>>> 'execution_date': datetime.datetime(2017, 1, 1, 0, 0),
>>>>>>> 'latest_date': '2017-01-01',
>>>>>>> 'macros': <module 'airflow.macros' from '/usr/local/lib/python3.6/site-packages/airflow/macros/__init__.py'>,
>>>>>>> 'next_execution_date': None,
>>>>>>> 'params': {},
>>>>>>> 'prev_execution_date': None,
>>>>>>> 'run_id': None,
>>>>>>> 'tables': None,
>>>>>>> 'task': <Task(PythonOperator): print_the_context>,
>>>>>>> 'task_instance': <TaskInstance: python_test_logging.print_the_context 2017-01-01 00:00:00 [None]>,
>>>>>>> 'task_instance_key_str': 'python_test_logging__print_the_context__20170101',
>>>>>>> 'templates_dict': None,
>>>>>>> 'test_mode': True,
>>>>>>> 'ti': <TaskInstance: python_test_logging.print_the_context 2017-01-01 00:00:00 [None]>,
>>>>>>> 'tomorrow_ds': '2017-01-02',
>>>>>>> 'tomorrow_ds_nodash': '20170102',
>>>>>>> 'ts': '2017-01-01T00:00:00',
>>>>>>> 'ts_nodash': '20170101T000000',
>>>>>>> 'var': {'json': None, 'value': None},
>>>>>>> 'yesterday_ds': '2016-12-31',
>>>>>>> 'yesterday_ds_nodash': '20161231'}
>>>>>>> hi from print
>>>>>>> [2017-10-17 10:27:38,441] {python_operator.py:90} INFO - Done. Returned value was: None
>>>>>>> 
>>>>>>> When I change it to `warn`, I do get:
>>>>>>> WARNING:root:hi from loggin info
>>>>>>> 
>>>>>>> By giving an explicit logger, as Daniel suggested, with a name that starts
>>>>>>> with 'airflow':
>>>>>>> def print_context(ds, **kwargs):
>>>>>>>   pprint(kwargs)
>>>>>>>   print('hi from print')
>>>>>>>   logger = logging.getLogger('airflow.python_test_logging')
>>>>>>>   logger.info('hi from loggin info')
>>>>>>> 
>>>>>>> Then it is picked up by the Airflow logging context:
>>>>>>> [2017-10-17 10:31:05,639] {dag.py:23} INFO - hi from loggin info
>>>>>>> 
>>>>>>> This is how the current logging is set:
>>>>>>> https://github.com/apache/incubator-airflow/blob/master/airflow/config_templates/airflow_local_settings.py
>>>>>>> 
>>>>>>> Currently I'm investigating why it used to work in Airflow 1.8. This is not
>>>>>>> yet clear to me.
>>>>>>> 
>>>>>>> Cheers, Fokko
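[Fokko's observation above — that a logger whose name starts with 'airflow' is picked up — comes down to Python's dotted-name logger hierarchy. A stdlib-only sketch of the mechanism; the buffer handler stands in for Airflow's actual task-log handler, which is configured in airflow_local_settings.py.]

```python
import io
import logging

# Attach a handler to the parent 'airflow' logger, as Airflow's logging
# config does (the StringIO handler here is illustrative only).
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))

parent = logging.getLogger("airflow")
parent.setLevel(logging.INFO)
parent.addHandler(handler)

# A child named under 'airflow.' has no handler of its own; its records
# propagate up the hierarchy to the parent's handlers.
child = logging.getLogger("airflow.python_test_logging")
child.info("hi from loggin info")

print(buf.getvalue().strip())  # INFO - hi from loggin info
```

This is why `logging.getLogger('airflow.python_test_logging')` shows up in the task log while an unrelated logger name does not.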
>>>>>>> 
>>>>>>> 
>>>>>>> 2017-10-17 7:58 GMT+02:00 Daniel Lamblin [Data Science & Platform Center]
>>>>>>> <lamblin@coupang.com>:
>>>>>>> 
>>>>>>>> Boris, I don't see where you configured the default logger
>>>>>>>> <https://docs.python.org/2/howto/logging.html#configuring-logging>. I think
>>>>>>>> you'd have to at least throw in:
>>>>>>>>   logger = logging.getLogger('python_test_logging')
>>>>>>>> and then use logger.info(…) instead of logging.info(…)
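[Daniel's point — that an unconfigured `logging.info(...)` goes nowhere visible — follows from the root logger's default WARNING threshold, which also explains Fokko's "WARNING:root:..." result. A stdlib-only sketch; the buffer handler is illustrative.]

```python
import io
import logging

# Module-level logging.info(...) logs to the root logger, whose default
# threshold is WARNING, so INFO records are silently dropped.
buf = io.StringIO()
root = logging.getLogger()
root.setLevel(logging.WARNING)  # make the documented default explicit
root.addHandler(logging.StreamHandler(buf))

logging.info("hi from loggin info")     # below WARNING: dropped
logging.warning("hi from loggin info")  # at WARNING: emitted

print(buf.getvalue().count("hi from loggin info"))  # 1
```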
>>>>>>>> 
>>>>>>>> On Tue, Oct 17, 2017 at 3:00 AM, Boris Tyukin <boris@boristyukin.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> sorry i was not clear.
>>>>>>>>> 
>>>>>>>>> In a DAG definition file, I would normally import logging and then use it
>>>>>>>>> with a PythonOperator like below (see print_context).
>>>>>>>>> 
>>>>>>>>> In 1.8.2 I would see both 'hi from print' and 'hi from logging' in the
>>>>>>>>> Airflow log file for that task.
>>>>>>>>> 
>>>>>>>>> Now I can only see 'hi from print'.
>>>>>>>>> 
>>>>>>>>> I installed from master (pip install) and used the newer airflow.cfg file.
>>>>>>>>> I only updated it to use the local executor and a MySQL db, and did not
>>>>>>>>> change any other settings.
>>>>>>>>> 
>>>>>>>>> So my question is whether it is supposed to work like that, and how do I
>>>>>>>>> log now from Python operators in a DAG.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> from __future__ import print_function
>>>>>>>>> from builtins import range
>>>>>>>>> import airflow
>>>>>>>>> from airflow.operators.python_operator import PythonOperator
>>>>>>>>> from airflow.models import DAG
>>>>>>>>> import logging
>>>>>>>>> import time
>>>>>>>>> from pprint import pprint
>>>>>>>>> 
>>>>>>>>> args = {
>>>>>>>>>   'owner': 'airflow',
>>>>>>>>>   'start_date': airflow.utils.dates.days_ago(2)
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> dag = DAG(
>>>>>>>>>   dag_id='python_test_logging', default_args=args,
>>>>>>>>>   schedule_interval=None)
>>>>>>>>> 
>>>>>>>>> def print_context(ds, **kwargs):
>>>>>>>>>   pprint(kwargs)
>>>>>>>>>   print('hi from print')
>>>>>>>>>   logging.info('hi from loggin info')
>>>>>>>>> 
>>>>>>>>> run_this = PythonOperator(
>>>>>>>>>   task_id='print_the_context',
>>>>>>>>>   provide_context=True,
>>>>>>>>>   python_callable=print_context,
>>>>>>>>>   dag=dag)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Oct 16, 2017 at 1:09 PM, Driesprong, Fokko <fokko@driesprong.frl>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Boris,
>>>>>>>>>> 
>>>>>>>>>> Can you please elaborate? Where did the output use to end up? Did you
>>>>>>>>>> provide any custom logging config, or are you using the default config?
>>>>>>>>>> Maybe drop some example code.
>>>>>>>>>> 
>>>>>>>>>> Cheers, Fokko
>>>>>>>>>> 
>>>>>>>>>> 2017-10-16 19:02 GMT+02:00 Boris <boriskey@gmail.com>:
>>>>>>>>>> 
>>>>>>>>>>> Hi guys,
>>>>>>>>>>> 
>>>>>>>>>>> I used to do logging.info("something") from Airflow Python operators so
>>>>>>>>>>> I can see output in the Airflow logs. Worked fine in 1.8.2.
>>>>>>>>>>> 
>>>>>>>>>>> Looks like this is no longer the case with the master branch. I did look
>>>>>>>>>>> at updating.md but it is still not clear how to log custom messages from
>>>>>>>>>>> Python operators.
>>>>>>>>>>> 
>>>>>>>>>>> Please advise.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> -Daniel Lamblin
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 

