arrow-dev mailing list archives

From "Jacques Nadeau (JIRA)" <>
Subject [jira] [Commented] (ARROW-62) Format: Are the nulls bits 0 or 1 for null values?
Date Sun, 13 Mar 2016 20:34:33 GMT


Jacques Nadeau commented on ARROW-62:

I consider the bitmap to be a validity map rather than a null map. I've also seen a couple
of places where it is nice to zero out null values using the 0 bit in the bitmap, without
a conditional, although I can't remember where we took advantage of this previously.
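
A minimal sketch of zeroing out null slots without a conditional, under assumptions the
message does not state: bits are numbered least-significant-first within each byte, and
the helper names `validity_bit` and `zero_out_nulls` are hypothetical, not taken from
Drill or Arrow code.

```python
from __future__ import annotations


def validity_bit(bitmap: bytes, i: int) -> int:
    """Return 1 if slot i is valid (non-null), 0 if it is null.

    Assumes LSB-first bit numbering within each byte (an assumption of this sketch).
    """
    return (bitmap[i >> 3] >> (i & 7)) & 1


def zero_out_nulls(values: list[int], bitmap: bytes) -> list[int]:
    """Zero every null slot by multiplying by its validity bit (no `if` per value)."""
    return [v * validity_bit(bitmap, i) for i, v in enumerate(values)]


# Slots 0 and 2 are valid (bits set); slots 1 and 3 are null (bits clear).
cleaned = zero_out_nulls([7, 9, 11, 13], bytes([0b0101]))
```

With a validity map, the bit itself is the multiplier. With a null map (1 = null), the
bit would have to be inverted first.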

> Format: Are the nulls bits 0 or 1 for null values?
> --------------------------------------------------
>                 Key: ARROW-62
>                 URL:
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Format
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
> As brought up by Dan Robinson on the mailing list (thank you for catching this!), the
> representation of nulls in the format documents is inconsistent with the ValueVectors
> code import. Since I drafted these format documents initially, I'll take the blame for
> the inconsistency, but:
> * Drill / ValueVectors uses the value 0 for null data, and 1 for non-null data
> * The format document currently states the opposite (values are null if the bit is set)
> I can see arguments both ways, but one argument for the ValueVectors style is that values
> must be explicitly set to be non-null, versus uninitialized values being accidentally
> interpreted as being non-null. When initializing a bitmap, one can {{memset}} the bits to 0,
> then set them to 1 when non-null values are appended during construction.
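
The construction pattern described in the quoted issue can be sketched as follows. This
is an illustration only: `ValidityBuilder` is a hypothetical class, not Drill or Arrow
code; the zero-filled `bytearray` stands in for the {{memset}} call; and LSB-first bit
numbering within each byte is an assumption.

```python
from __future__ import annotations


class ValidityBuilder:
    """Fixed-capacity builder using 0 = null, 1 = non-null (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        # bytearray(n) is zero-filled, playing the role of memset(bitmap, 0, n).
        self.bitmap = bytearray((capacity + 7) // 8)
        self.values = [0] * capacity
        self.length = 0

    def append(self, value: int | None) -> None:
        i = self.length
        if i >= len(self.values):
            raise IndexError("builder is full")
        if value is not None:
            self.values[i] = value
            # Only a non-null append sets the bit; nulls leave it at 0.
            self.bitmap[i >> 3] |= 1 << (i & 7)
        self.length += 1

    def is_valid(self, i: int) -> bool:
        return bool((self.bitmap[i >> 3] >> (i & 7)) & 1)


builder = ValidityBuilder(10)
for v in [5, None, 8]:
    builder.append(v)
```

Slots that were never written (indices 3 through 9 here) still read as null, which is
the safety argument made for the ValueVectors style.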

This message was sent by Atlassian JIRA