airflow-dev mailing list archives

From Boris Tyukin <>
Subject Re: how to capture sqoop mapreduce counters
Date Wed, 25 Jan 2017 21:25:59 GMT
I figured out that, luckily for me, sqoop reports the number of rows loaded
on the very last line of stdout. So I just used BashOperator and
set xcom_push=True. Then I did something like this:

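    # Log row_count ingested
    # (the archive dropped the call names; re.search, last_line and
    # ti.xcom_push below are assumed reconstructions)
    try:
        row_count = int(re.search(r'Retrieved (\d+) records',
                                  last_line).group(1))
        ti.xcom_push("rows_ingested_sqoop", row_count)
    except (AttributeError, ValueError):
        ti.xcom_push("rows_ingested_sqoop", -1)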
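
For anyone who wants to try this, here is a minimal, self-contained sketch of the same idea. It assumes the BashOperator pushed sqoop's last output line to XCom; the task id "sqoop_import" and the function names are hypothetical, and only the 'Retrieved (\d+) records' pattern and the "rows_ingested_sqoop" key come from the message above.

```python
import re

# Pattern for sqoop's final summary line, e.g. "... Retrieved 1234 records."
ROW_COUNT_RE = re.compile(r"Retrieved (\d+) records")


def parse_sqoop_row_count(last_line):
    """Return the row count found in the line, or -1 if it is missing."""
    match = ROW_COUNT_RE.search(last_line or "")
    return int(match.group(1)) if match else -1


def log_row_count(**context):
    """Callable intended for a PythonOperator placed after the sqoop task."""
    ti = context["ti"]
    # "sqoop_import" is a hypothetical task_id for the upstream BashOperator.
    last_line = ti.xcom_pull(task_ids="sqoop_import")
    ti.xcom_push(key="rows_ingested_sqoop",
                 value=parse_sqoop_row_count(last_line))
```

Returning -1 instead of raising keeps the DAG running when the summary line is absent, which matches the fallback in the snippet above; raising an exception would be the stricter choice if a missing count should fail the task.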

The alternative I was considering is to get the MapReduce job ID and then use
the mapred command to get the needed counter. Here is an example:

mapred job -counter job_1484574566480_0002 <group-name> <counter-name>

But I could not figure out an easy way to get the job ID from the BashOperator /
sqoop output. I guess I could create my own operator that would capture all
stdout lines, not only the last one.
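
A rough sketch of what such a capture step could look like, outside of any operator class. The helper names are hypothetical, the job ID pattern is assumed from the example ID above, and the counter group and name in the usage note are assumptions rather than something confirmed in this thread.

```python
import re
import subprocess

# Assumed shape of a MapReduce job ID, based on job_1484574566480_0002.
JOB_ID_RE = re.compile(r"\bjob_\d+_\d+\b")


def run_and_capture(command):
    """Run a command; return (exit code, all output lines).

    stderr is merged into stdout so log lines are not lost.
    """
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.splitlines()


def extract_job_id(lines):
    """Return the first job ID found in the captured lines, or None."""
    for line in lines:
        match = JOB_ID_RE.search(line)
        if match:
            return match.group(0)
    return None


def counter_command(job_id, group, counter):
    """Build the argv list for: mapred job -counter <job-id> <group> <counter>."""
    return ["mapred", "job", "-counter", job_id, group, counter]
```

With the sqoop output captured, something like counter_command(job_id, "org.apache.hadoop.mapreduce.TaskCounter", "MAP_OUTPUT_RECORDS") could then be passed to run_and_capture; that group and counter pair is an assumption and should be checked against the counters your job actually reports.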

On Tue, Jan 24, 2017 at 9:07 AM, Boris Tyukin <> wrote:

> Hello all,
> is there a way to capture sqoop counters either using bash or sqoop
> operator? Specifically I need to pull a total number of rows loaded.
> By looking at bash operator, I think there is an option to push the last
> line of output to xcom but sqoop and mapreduce output is a bit more
> complicated.
> Thanks!
