airflow-dev mailing list archives

From "Driesprong, Fokko" <fo...@driesprong.frl>
Subject Re: question regarding Spark SQL operator (hook) logging in Airflow
Date Wed, 07 Mar 2018 09:03:15 GMT
Hi Aleksander,

What version of Airflow are you using?

As mentioned in the ticket, the problem was as follows. In the old
situation, the STDOUT buffer was consumed first, and the STDERR buffer
was only consumed after STDOUT was exhausted. This was problematic
because STDERR kept filling up in the meantime, and once it reached its
maximum size, the subprocess would block. Therefore we now pipe STDERR
into STDOUT and consume only STDOUT. Does this make sense? Please share
the logs of the SparkSqlOperator job.
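
For reference, here is a minimal sketch of that pattern using Python's
standard subprocess module (not the actual hook code; the spark-sql
command line below is just a placeholder):

import subprocess

# Merge STDERR into STDOUT so only one pipe needs to be drained,
# which avoids the deadlock described above.
proc = subprocess.Popen(
    ["spark-sql", "-e", "SELECT 1"],  # hypothetical command
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

# iter(..., b"") stops as soon as readline() returns an empty byte
# string (EOF), so the loop does not keep consuming empty reads
# after the job has finished.
for raw_line in iter(proc.stdout.readline, b""):
    print(raw_line.decode("utf-8").rstrip())

proc.wait()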

Cheers, Fokko

2018-03-07 9:57 GMT+01:00 Aleksander Sabitov <sabitov.ay@gmail.com>:

> Hi Fokko!
> Maybe by chance you can advise me on something. I'm having a small issue
> while using the Spark SQL operator in Airflow. Most probably it's
> connected to the fact that I use Docker to run Airflow. The issue is that
> when a Spark SQL job fails or succeeds (regardless of the outcome), the
> logging loop still continuously consumes empty byte strings. Before the
> last commits to the Spark SQL hook, it was reading byte strings and
> decoding them "on the fly":
> https://github.com/apache/incubator-airflow/commit/32750601ad0a422283613bf7fccff8eb5407bc9c#diff-16c0ecc7c4b60bfe6e66592bb70e17cf
> I think my issue may be connected to the combination of Docker usage and
> byte strings.
> Any ideas?
>
> Thanks in advance!
>
> Aleksandr
>
