spark-issues mailing list archives

From "Earthson Lu (JIRA)" <>
Subject [jira] [Commented] (SPARK-12746) ArrayType(_, true) should also accept ArrayType(_, false)
Date Thu, 14 Jan 2016 06:57:39 GMT


Earthson Lu commented on SPARK-12746:

OK, I see :)

If there is no nullability in ML, how could we implement a Transformer that fills in missing
values (which are always represented as NULL)? I think we need to support nullability for
preprocessing, so that we can get clean data for further operations. I can't imagine a
situation where we could do nothing when the data contains NULL.
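
To make the preprocessing point concrete, here is a minimal sketch in plain Python rather than Spark. The function name `fill_missing` and the list-of-dicts row layout are assumptions made for illustration only; they are not part of Spark ML or of this issue.

```python
def fill_missing(rows, column, default):
    """Return new rows in which a missing value (None) in `column` is replaced by `default`.

    Hypothetical helper: rows are modeled as plain dicts, and None stands in for NULL.
    """
    return [
        {**row, column: default if row.get(column) is None else row[column]}
        for row in rows
    ]


rows = [{"age": 30}, {"age": None}]
cleaned = fill_missing(rows, "age", 0)
```

A step like this can only exist if its input column is allowed to contain NULL, which is why preprocessing needs nullable input types.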

- - -

I think the type-checking API is independent of nullability in ML. It is common for one
transformer to accept both BooleanType and IntegerType. Maybe it would be a good idea to
implement test conditions and assertions separately.
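
One way to picture this is the sketch below, written in plain Python as a toy model and not against Spark's API. The `ArrayType` dataclass here is a hypothetical stand-in for the Spark SQL type, and `accepts`, `accepts_any`, and `require_any` are illustrative names that do not exist in Spark. It shows a nullability-aware test condition kept separate from the assertion that raises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayType:
    """Toy stand-in for a SQL array type (hypothetical, not Spark's class)."""
    element_type: str
    contains_null: bool


def accepts(required, actual):
    """Test condition: can a column of type `actual` be used where `required` is expected?"""
    if isinstance(required, ArrayType) and isinstance(actual, ArrayType):
        if required.element_type != actual.element_type:
            return False
        # A non-nullable array is a special case of a nullable one, so reject
        # only when nulls may be present but the required type forbids them.
        return required.contains_null or not actual.contains_null
    return required == actual


def accepts_any(allowed, actual):
    """Test condition for a transformer that accepts several input types."""
    return any(accepts(required, actual) for required in allowed)


def require_any(allowed, actual):
    """Assertion built on top of the test condition."""
    if not accepts_any(allowed, actual):
        raise TypeError(f"expected one of {allowed}, got {actual}")
```

With this split, a check for `ArrayType("string", True)` passes for `ArrayType("string", False)`, and a transformer that takes either of two scalar types can call `require_any(["boolean", "int"], actual)`.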

> ArrayType(_, true) should also accept ArrayType(_, false)
> ---------------------------------------------------------
>                 Key: SPARK-12746
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, SQL
>    Affects Versions: 1.6.0
>            Reporter: Earthson Lu
> I see CountVectorizer has schema check for ArrayType which has ArrayType(StringType,
> ArrayType(String, false) is just a special case of ArrayType(String, true), but it will
> not pass this type check.
