arrow-dev mailing list archives

From "Wes McKinney (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ARROW-608) [Format] Days since epoch date type
Date Sun, 12 Mar 2017 15:55:04 GMT

     [ https://issues.apache.org/jira/browse/ARROW-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-608:
-------------------------------
    Description: 
While we've decided to make the primary IPC date type be int64 milliseconds since the UNIX
epoch, in many libraries dates are represented as integer (int32, usually) days since some
epoch. In the Python standard library, date ordinals count from January 1 of year 1, which
has ordinal 1:

{code}
>>> import datetime
>>> d = datetime.date(2017, 1, 17)
>>> d.toordinal()
736346
>>> d.toordinal() // 365  # floor division; a rough year estimate that ignores leap days
2017
{code}
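
For comparison with the millisecond-based IPC date type, here is a minimal sketch of turning a Python ordinal into days, and then milliseconds, since the UNIX epoch. The helper names are hypothetical and exist only for illustration; they are not part of Arrow or the standard library.

```python
import datetime

# Ordinal of the UNIX epoch (1970-01-01) in Python's proleptic Gregorian count
UNIX_EPOCH_ORDINAL = datetime.date(1970, 1, 1).toordinal()
MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def days_since_unix_epoch(d):
    """Hypothetical helper: day count since 1970-01-01 (fits in int32)."""
    return d.toordinal() - UNIX_EPOCH_ORDINAL


def millis_since_unix_epoch(d):
    """Hypothetical helper: milliseconds since the UNIX epoch at midnight."""
    return days_since_unix_epoch(d) * MILLIS_PER_DAY


d = datetime.date(2017, 1, 17)
print(days_since_unix_epoch(d))    # 17183
print(millis_since_unix_epoch(d))  # 1484611200000
```

The two representations differ only by a constant offset and a scale factor, so converting between them is cheap, but it does require rewriting every value.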

At least in Cplusplus-land, while working on ARROW-452 I ran into the problem of how to do
zero-copy reads of such data while preserving the metadata needed to know that the values are
dates. I added a cpp-only "date32" type to support this use case: https://github.com/apache/arrow/pull/365
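
To illustrate the idea (not the implementation in the pull request), here is a rough Python sketch of reading int32 day counts without copying them while still carrying a tag that marks the values as dates. The `Date32View` class and its `logical_type` label are hypothetical, and the sketch assumes the day counts are relative to the UNIX epoch.

```python
import array
import datetime

# Assumption: day counts are relative to the UNIX epoch
UNIX_EPOCH = datetime.date(1970, 1, 1)


class Date32View:
    """Hypothetical wrapper: int32 day counts plus a logical-type tag."""

    logical_type = "date32"  # illustrative label, not an Arrow API

    def __init__(self, buffer):
        # memoryview wraps the existing buffer; no values are copied
        self.days = memoryview(buffer)
        if self.days.format != "i" or self.days.itemsize != 4:
            raise TypeError("expected a buffer of 32-bit signed integers")

    def __len__(self):
        return len(self.days)

    def __getitem__(self, i):
        # Interpret the stored integer as days since the epoch
        return UNIX_EPOCH + datetime.timedelta(days=self.days[i])


raw = array.array("i", [0, 17183])  # day counts owned by someone else
dates = Date32View(raw)
print(dates[1])  # 2017-01-17

# Changing the underlying buffer is visible through the view,
# which shows that the values were never copied.
raw[0] = 1
print(dates[0])  # 1970-01-02
```

Without the type tag, a consumer would see only plain int32 values and would have no way of knowing they are dates.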

I'm not sure whether we should add a new logical type, but thought it would be worth bringing
up in any case.

  was:
While we've decided to make the primary IPC date type be int64 milliseconds since the UNIX
epoch, in many libraries dates are represented as integer (int32, usually) days since some
epoch. In the Python standard library, date ordinals count from January 1 of year 1, which
has ordinal 1:

{code}
>>> import datetime
>>> d = datetime.date(2017, 1, 17)
>>> d.toordinal()
736346
>>> d.toordinal() // 365  # floor division; a rough year estimate that ignores leap days
2017
{code}

At least in C++-land, in working on ARROW-452 I ran into the problem of how to do zero-copy
reads of such data, while preserving the metadata to know that the values are dates. I added
a C++-only "date32" type to support this use case https://github.com/apache/arrow/pull/365

I'm not sure whether we should add a new logical type, but thought it would be worth bringing
up in any case


> [Format] Days since epoch date type
> -----------------------------------
>
>                 Key: ARROW-608
>                 URL: https://issues.apache.org/jira/browse/ARROW-608
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Format
>            Reporter: Wes McKinney
>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
