arrow-dev mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject [RESULT] [VOTE] Add 64-bit offset list, binary, string (utf8) types to the Arrow columnar format
Date Wed, 01 May 2019 14:05:55 GMT
The vote carries with 3 binding +1 and 2 non-binding +1

On Fri, Apr 26, 2019 at 10:05 AM Brian Bowman <Brian.Bowman@sas.com> wrote:
>
> Can non-Arrow PMC members/committers vote?
>
> If so, +1
>
> -Brian
>
> On 4/25/19, 4:34 PM, "Wes McKinney" <wesmckinn@gmail.com> wrote:
>
>     In a recent mailing list discussion [1] Micah Kornfield has proposed
>     to add new list and variable-size binary and unicode types to the
>     Arrow columnar format with 64-bit signed integer offsets, to be used
>     in addition to the existing 32-bit offset varieties. These will be
>     implemented as new types in the Type union in Schema.fbs (the
>     particular names can be debated in the PR that implements them):
>
>     LargeList
>     LargeBinary
>     LargeString [UTF8]
>
>     While very large contiguous columns are not a principal use case for
>     the columnar format, it has been observed empirically that there are
>     applications that use the format to represent datasets where
>     realizations of data can sometimes exceed the 2^31 - 1 "capacity" of a
>     column and cannot be easily (or at all) split into smaller chunks.
>
>     Please vote whether to accept the changes. The vote will be open for at
>     least 72 hours.
>
>     [ ] +1 Accept the additions to the columnar format
>     [ ] +0
>     [ ] -1 Do not accept the changes because...
>
>     Thanks,
>     Wes
>
>     [1]: https://lists.apache.org/thread.html/8088eca21b53906315e2bbc35eb2d246acf10025b5457eccc7a0e8a3@%3Cdev.arrow.apache.org%3E
>
>
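
To make the 2^31 - 1 limit discussed above concrete, here is a minimal
pure-Python sketch of how an offsets buffer for a variable-size binary
column grows, and when 32-bit signed offsets stop being sufficient. It is
not the Arrow library API: the helper names (build_offsets,
pick_offset_format, pack_offsets) are hypothetical and exist only for
illustration. The one layout fact it relies on is that a column of n
values carries n + 1 offsets, with the last offset equal to the total
byte length of the data.

```python
import struct

# Largest value a 32-bit signed offset can hold.
INT32_MAX = 2**31 - 1


def build_offsets(values):
    """Return the n + 1 offsets for a list of n byte strings."""
    offsets = [0]
    for v in values:
        offsets.append(offsets[-1] + len(v))
    return offsets


def pick_offset_format(offsets):
    """Return struct code 'i' (int32) if the last offset fits, else 'q' (int64)."""
    return "i" if offsets[-1] <= INT32_MAX else "q"


def pack_offsets(offsets):
    """Pack offsets little-endian using the narrowest width that fits."""
    fmt = pick_offset_format(offsets)
    return fmt, struct.pack(f"<{len(offsets)}{fmt}", *offsets)


values = [b"arrow", b"", b"columnar"]
offsets = build_offsets(values)      # cumulative byte positions
fmt, buf = pack_offsets(offsets)     # small column: 32-bit offsets suffice

# A column whose data totals 2^31 bytes no longer fits in int32 offsets,
# which is the case the proposed Large* types are meant to cover.
big_fmt, big_buf = pack_offsets([0, 2**31])
```

In this sketch the width is chosen per column after the fact; in the
proposal the choice is instead expressed by the type itself (for example
LargeBinary versus Binary), so readers know the offset width from the
schema alone.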
