spark-issues mailing list archives

From "Andre Schumacher (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-1649) Figure out Nullability semantics for Array elements and Map values
Date Sat, 10 May 2014 22:04:48 GMT

    [ https://issues.apache.org/jira/browse/SPARK-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994194#comment-13994194 ]

Andre Schumacher commented on SPARK-1649:
-----------------------------------------

Sorry for the delay. OK, not allowing null values inside Parquet arrays seems the cleanest
solution for now, until the semantics have been thought through more fully.
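
A minimal sketch of how such a restriction could be enforced at write time (the object
and method names here are hypothetical, not the actual Spark-Parquet writer):

    object ParquetArrayGuard {
      // Fail fast on a null array element instead of trying to encode it,
      // so the restriction surfaces as a clear error rather than bad data.
      def writeElement(encode: Any => Unit)(value: Any): Unit = {
        if (value == null)
          throw new UnsupportedOperationException(
            "null values inside Parquet arrays are not supported")
        encode(value)
      }
    }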

> Figure out Nullability semantics for Array elements and Map values
> ------------------------------------------------------------------
>
>                 Key: SPARK-1649
>                 URL: https://issues.apache.org/jira/browse/SPARK-1649
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: Andre Schumacher
>            Priority: Critical
>
> Recording in the data type itself whether a column can be nullable would simplify
> things for the underlying storage layer, such as schema conversion and predicate
> filter determination. The DataType type could then look like this:
> abstract class DataType(val nullable: Boolean = true)
> Concrete subclasses could then override the nullable val. In most cases the default
> would suffice, but for types contained inside nested types one could optimize, e.g.,
> by distinguishing arrays whose elements may be null from those whose elements cannot.
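
A minimal sketch of the proposed shape, under the assumption that element nullability
lives on the element type itself (class names here are illustrative, not the actual
Spark SQL API):

    abstract class DataType(val nullable: Boolean = true)

    // Scalar types keep the default: values may be null.
    case object StringType extends DataType

    // A type known never to hold nulls overrides the default.
    case object NonNullableInt extends DataType(nullable = false)

    // A container type can then inspect its element type's nullability,
    // e.g. to decide whether a Parquet array may contain nulls.
    case class ArrayType(elementType: DataType) extends DataType

    object Demo extends App {
      println(ArrayType(StringType).elementType.nullable)      // true
      println(ArrayType(NonNullableInt).elementType.nullable)  // false
    }

With this shape, a storage layer could, for example, skip null handling entirely
whenever elementType.nullable is false.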



--
This message was sent by Atlassian JIRA
(v6.2#6252)
