airflow-commits mailing list archives

From "Artiom (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1144) Logging causes UnicodeEncodeError when using Japanese characters
Date Wed, 05 Jul 2017 15:30:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074939#comment-16074939 ]

Artiom commented on AIRFLOW-1144:
---------------------------------

The same problem has been breaking my DAGs since I switched from 1.7.1.3.
It appears on my default Vagrant Ubuntu 16.04 machine. Here is the output of the locale command:

LANG=en_US.UTF-8
LANGUAGE=en_US:
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
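
The locale above comes from an interactive shell, and the 'ascii' codec named in the traceback is not necessarily tied to it. A minimal sketch for checking which encodings a Python process actually sees, assuming it is run with the same interpreter and environment as the Airflow worker (the helper name report_encodings is hypothetical, not part of Airflow):

```python
import locale
import sys


def report_encodings():
    """Collect the encodings the current Python process would use."""
    return {
        # Codec for implicit str/unicode conversions:
        # 'ascii' on Python 2, 'utf-8' on Python 3, independent of the locale.
        "default": sys.getdefaultencoding(),
        # Encoding derived from the process locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # Encoding of stdout; can be None on Python 2 when output is piped.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in sorted(report_encodings().items()):
        print("{}: {}".format(name, value))
```

If "default" reports ascii while "preferred" reports UTF-8, the locale is probably not the cause, and the implicit conversion in the logging call is the more likely suspect.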

To reproduce it, I created a DAG with a single bash operator task that runs 'download.sh'.
The code for download.sh is simple:
wget ftp://anonymous:guest@ftp.debian.org/debian/README.mirrors.txt

It breaks on the first opening quote character (u'\u2018', a left single quotation mark) in the command output.

Jul 05 15:27:00 vagrant airflow[29929]: Exception in thread Thread-1:
Jul 05 15:27:00 vagrant airflow[29929]: Traceback (most recent call last):
Jul 05 15:27:00 vagrant airflow[29929]:   File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
Jul 05 15:27:00 vagrant airflow[29929]:     self.run()
Jul 05 15:27:00 vagrant airflow[29929]:   File "/usr/lib/python2.7/threading.py", line 754, in run
Jul 05 15:27:00 vagrant airflow[29929]:     self.__target(*self.__args, **self.__kwargs)
Jul 05 15:27:00 vagrant airflow[29929]:   File "/var/lib/airflow/venv/local/lib/python2.7/site-packages/airflow/task_runner/base_task_runner.py", line 95, in _read_task_logs
Jul 05 15:27:00 vagrant airflow[29929]:     self.logger.info('Subtask: {}'.format(line.rstrip('\n')))
Jul 05 15:27:00 vagrant airflow[29929]: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 58: ordinal not in range(128)

Any suggestions?
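
A minimal sketch of what the traceback suggests is happening, not a confirmed diagnosis: on Python 2, formatting a unicode value into a byte-string template implicitly encodes it with the ASCII codec. The sample line and both helper functions below are hypothetical and only illustrate the mechanism. The implicit encode is written out explicitly so the sketch also runs on Python 3.

```python
# Sample line resembling wget output in a UTF-8 locale (assumed, not taken from the log).
line = u"Saving to: \u2018README.mirrors.txt\u2019\n"


def format_like_python2_bytes(text):
    """Emulate Python 2 formatting a unicode value into a byte-string template.

    Python 2 implicitly encodes the unicode argument with the ASCII codec;
    that step is made explicit here.
    """
    return b"Subtask: " + text.rstrip(u"\n").encode("ascii")


def format_as_unicode(text):
    """Use a unicode template so no implicit ASCII encoding takes place."""
    return u"Subtask: {}".format(text.rstrip(u"\n"))


try:
    format_like_python2_bytes(line)
    bad_char = None
except UnicodeEncodeError as exc:
    # exc.object is the text being encoded; exc.start indexes the offending character.
    bad_char = exc.object[exc.start]

safe_message = format_as_unicode(line)
```

Here bad_char ends up as the left single quotation mark, the same character reported in the traceback, while the unicode template formats the line without raising. Whether changing the template in base_task_runner.py is the right fix for Airflow is an assumption that would need to be checked against the actual code.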

> Logging causes UnicodeEncodeError when using Japanese characters
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-1144
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1144
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging, worker
>    Affects Versions: 1.8.0
>            Reporter: Sushant Karki
>
> I am using the bash operator to pipe a SQL dump to my database. Since the encoding of my psql client is Japanese, the output contains some Japanese characters. Whenever the logger tries to log the output, it raises a UnicodeEncodeError.
> Here are the details of the error.
> {code}
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/threading.py", line 813, in __bootstrap_inner
>     self.run()
>   File "/usr/lib64/python2.7/threading.py", line 766, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/home/karki/virtualenv/master/local/lib/python2.7/site-packages/airflow/task_runner/base_task_runner.py", line 95, in _read_task_logs
>     self.logger.info('Subtask: {}'.format(line.rstrip('\n')))
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u884c' in position 58: ordinal not in range(128)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
