airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boris Tyukin <>
Subject parsing task instance log files
Date Thu, 09 Feb 2017 13:35:20 GMT

I am using HiveCliHook called from PythonOperator to run a series of
queries and want to capture record counts for auditing and validation

*I am thinking to use on_success_callback to invoke python function that
will read the log file, produced by airflow and then parse it out using
regex. *

*I am going to use this method from models to get to the file log:*

*def log_filepath(self): iso = self.execution_date.isoformat() log =
os.path.expanduser(configuration.get('core', 'BASE_LOG_FOLDER')) return (
Is this a good strategy or there is an easier way? I wondering if someone
did something similar.

Another challenge is that the same log file contains multiple attempts and
reruns of the same task so I guess I need to parse the file backwards.


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message