Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/234#discussion_r168525141
--- Diff: src/ports/postgres/modules/utilities/encode_categorical.py_in ---
@@ -317,7 +317,19 @@ class CategoricalEncoder(object):
if self.output_type not in ('array', 'svec'):
if not self._output_dictionary:
- value_names = {None: 'NULL',
+ ## MADLIB-1202
+ ## In postgres, boolean variables are always saved
+ ## as 'True', 'False' with the first letter as capital,
+ ## which will cause the generated column name as
+ ## <boolean column name>_True/False that needs double
## quoting to query. To make it more convenient to user,
+ ## we cast them to lower case true/false so that the
+ ## generated column name is <boolean column name>_true/false
+ ## The same logic applied to _null and _misc strs
+ if v in ('True', 'False'):
--- End diff --
Wouldn't this be better off as `if isinstance(v, bool)`?
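To illustrate the difference, here is a minimal sketch, not the PR's code. `label_by_string` and `label_by_type` are hypothetical helpers, and the sketch assumes `v` can reach this point as a Python `bool`. If the values have already been cast to text upstream, only the string comparison from the diff would match.

```python
def label_by_string(v):
    # Check as written in the diff: matches only the *strings*
    # 'True' and 'False', not Python booleans.
    if v in ('True', 'False'):
        return v.lower()
    return str(v)


def label_by_type(v):
    # Suggested check: matches real booleans only. isinstance(1, bool)
    # is False, so integer categories are left alone.
    if isinstance(v, bool):
        return str(v).lower()
    return str(v)


if __name__ == '__main__':
    for value in (True, 'True', 1):
        print(repr(value), label_by_string(value), label_by_type(value))
```

With the type check, a text column that happens to contain the literal value 'True' keeps its original case, whereas the string check lower-cases it as well.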
---