airflow-dev mailing list archives

From Chris Riccomini <criccom...@apache.org>
Subject Re: Using s3 logging in Airflow 1.9.x
Date Mon, 09 Oct 2017 17:39:39 GMT
Have a look at this:

https://github.com/apache/incubator-airflow/pull/2671

I had to do a similar dance.
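
Roughly, the pieces that have to line up in airflow.cfg for 1.9 are something
like the below. The module path and connection id are placeholders, and it's
worth double-checking the exact [core] key names against UPDATING.md for your
version:

```
[core]
# custom logging module; must be importable from PYTHONPATH
logging_config_class = log_config.LOGGING_CONFIG
# tell the webserver to read task logs back through the s3 handler
task_log_reader = s3.task
# Airflow connection the S3 handler uses to reach the bucket
remote_log_conn_id = my_s3_conn
```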


On Mon, Oct 9, 2017 at 10:28 AM, David Klosowski <davidk@thinknear.com>
wrote:

> Out of curiosity, has anyone who has tested the latest 1.9 branch of
> Airflow had the S3 logging feature work?
>
> I imagine anyone running Airflow in a distributed container environment
> will want to be able to view the task logs in the UI, and without this
> feature it won't work, since the task logs can't be shared at the host
> level when the containers are spread across hosts.
>
> Thanks.
>
> Regards,
> David
>
> On Sat, Oct 7, 2017 at 10:06 AM, David Klosowski <davidk@thinknear.com>
> wrote:
>
> > Hi Ash,
> >
> > Thanks for the response.
> >
> > I neglected to post that I do in fact have that config specified in the
> > airflow.cfg.
> >
> > My file logger shows the custom formatting I have set, and I see a log
> > line in the console from the scheduler:
> >
> > {logging_config.py:42} INFO - Successfully imported user-defined logging
> > config from
> >
> > Cheers,
> > David
> >
> >
> > On Sat, Oct 7, 2017 at 7:42 AM, Ash Berlin-Taylor <
> > ash_airflowlist@firemirror.com> wrote:
> >
> >> It could be that you have created a custom logging file, but you haven't
> >> specified it in your airflow.cfg:
> >>
> >> ```
> >> logging_config_class=mymodule.LOGGING_CONFIG
> >> ```
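> >>
> >> A quick sanity check that the module actually resolves on PYTHONPATH
> >> (module name here follows the placeholder above):
> >>
> >> ```
> >> # should print the handler names from your custom config
> >> from mymodule import LOGGING_CONFIG
> >> print(LOGGING_CONFIG['handlers'].keys())
> >> ```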
> >>
> >>
> >>
> >>
> >> > On 7 Oct 2017, at 00:11, David Klosowski <davidk@thinknear.com>
> wrote:
> >> >
> >> > Hey Airflow Devs:
> >> >
> >> > How is s3 logging supposed to work in Airflow 1.9.0?
> >> >
> >> > I've followed the *UPDATING.md* guide for the new setup of logging, and
> >> > while I can use my custom logging configuration module to format the
> >> > files written to the host, the s3 logging doesn't appear to work as I
> >> > don't see anything in s3.
> >> >
> >> > *> airflow.cfg*
> >> >
> >> > task_log_reader = s3.task
> >> >
> >> >
> >> > *> custom logging module added to PYTHONPATH with __init__.py in directory*
> >> > ----
> >> > import os
> >> >
> >> > from airflow import configuration as conf
> >> >
> >> > LOG_LEVEL = conf.get('core', 'LOGGING_LEVEL').upper()
> >> > LOG_FORMAT = conf.get('core', 'log_format')
> >> >
> >> > BASE_LOG_FOLDER = conf.get('core', 'BASE_LOG_FOLDER')
> >> >
> >> > FILENAME_TEMPLATE = '{{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log'
> >> >
> >> > STAGE = os.getenv('STAGE')
> >> >
> >> > S3_LOG_FOLDER = 's3://tn-testing-bucket/dk'
> >> >
> >> > LOGGING_CONFIG = {
> >> >    'version': 1,
> >> >    'disable_existing_loggers': False,
> >> >    'formatters': {
> >> >        'airflow.task': {
> >> >            'format': LOG_FORMAT,
> >> >        },
> >> >    },
> >> >    'handlers': {
> >> >        'console': {
> >> >            'class': 'logging.StreamHandler',
> >> >            'formatter': 'airflow.task',
> >> >            'stream': 'ext://sys.stdout'
> >> >        },
> >> >        'file.task': {
> >> >            'class': 'airflow.utils.log.file_task_handler.FileTaskHandler',
> >> >            'formatter': 'airflow.task',
> >> >            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
> >> >            'filename_template': FILENAME_TEMPLATE,
> >> >        },
> >> >        # When using s3 or gcs, provide a customized LOGGING_CONFIG
> >> >        # in airflow_local_settings within your PYTHONPATH, see UPDATING.md
> >> >        # for details
> >> >        's3.task': {
> >> >            'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
> >> >            'formatter': 'airflow.task',
> >> >            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
> >> >            's3_log_folder': S3_LOG_FOLDER,
> >> >            'filename_template': FILENAME_TEMPLATE,
> >> >        }
> >> >    },
> >> >    'loggers': {
> >> >        'airflow.task': {
> >> >            'handlers': ['s3.task'],
> >> >            'level': LOG_LEVEL,
> >> >            'propagate': False,
> >> >        },
> >> >        'airflow.task_runner': {
> >> >            'handlers': ['s3.task'],
> >> >            'level': LOG_LEVEL,
> >> >            'propagate': True,
> >> >        },
> >> >        'airflow': {
> >> >            'handlers': ['console'],
> >> >            'level': LOG_LEVEL,
> >> >            'propagate': False,
> >> >        },
> >> >    }
> >> > }
> >> > -----
> >> >
> >> > I never see any task logs in S3, even after completion of all tasks.
> >> >
> >> > While running this in docker, since my executors/workers are on
> >> > different hosts, when I try to pull up the task logs in the UI I
> >> > receive the following error b/c they aren't there in s3:
> >> >
> >> > Failed to fetch log file from worker.
> >> > HTTPConnectionPool(host='f0cf9e596af6', port=8793): Max retries
> >> > exceeded with url:
> >> >
> >> >
> >> >
> >> > Any additional hints you can provide on what else needs to be done?
> >> >
> >> > Thanks.
> >> >
> >> > Regards,
> >> > David
> >>
> >>
> >
>
