arrow-dev mailing list archives

From "Joris Van den Bossche (Jira)" <j...@apache.org>
Subject [jira] [Created] (ARROW-6749) [Python] Conversion of non-ns timestamp array to numpy gives wrong values
Date Tue, 01 Oct 2019 10:08:00 GMT
Joris Van den Bossche created ARROW-6749:
--------------------------------------------

             Summary: [Python] Conversion of non-ns timestamp array to numpy gives wrong values
                 Key: ARROW-6749
                 URL: https://issues.apache.org/jira/browse/ARROW-6749
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Joris Van den Bossche


{code}
In [25]: np_arr = np.arange("2012-01-01", "2012-01-06", int(1e6)*60*60*24, dtype="datetime64[us]")

In [26]: np_arr
Out[26]: 
array(['2012-01-01T00:00:00.000000', '2012-01-02T00:00:00.000000',
       '2012-01-03T00:00:00.000000', '2012-01-04T00:00:00.000000',
       '2012-01-05T00:00:00.000000'], dtype='datetime64[us]')

In [27]: arr = pa.array(np_arr)

In [28]: arr
Out[28]: 
<pyarrow.lib.TimestampArray object at 0x7f0b2ef07ee8>
[
  2012-01-01 00:00:00.000000,
  2012-01-02 00:00:00.000000,
  2012-01-03 00:00:00.000000,
  2012-01-04 00:00:00.000000,
  2012-01-05 00:00:00.000000
]

In [29]: arr.type
Out[29]: TimestampType(timestamp[us])

In [30]: arr.to_numpy()
Out[30]: 
array(['1970-01-16T08:09:36.000000000', '1970-01-16T08:11:02.400000000',
       '1970-01-16T08:12:28.800000000', '1970-01-16T08:13:55.200000000',
       '1970-01-16T08:15:21.600000000'], dtype='datetime64[ns]')
{code}

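So to_numpy() seems to simply reinterpret the integer microsecond values as nanoseconds,
without rescaling them, when converting to numpy: the result has dtype datetime64[ns], but
the values land in January 1970 instead of January 2012.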
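
The arithmetic behind that reading can be checked with NumPy alone. The sketch below is only an illustration of the suspected unit mix-up, not the actual pyarrow conversion code: it takes the same microsecond timestamps, reinterprets their raw 64-bit integers as nanoseconds, and compares that with a properly rescaled conversion. The variable names are made up for this example.

```python
import numpy as np

# Same five days as in the report, stored at microsecond resolution.
np_arr = np.arange("2012-01-01", "2012-01-06", dtype="datetime64[D]").astype("datetime64[us]")

# Raw storage: microseconds since the Unix epoch as 64-bit integers.
raw_us = np_arr.astype("int64")

# Suspected bug: the same integers read as nanoseconds, with no rescaling.
misread = raw_us.view("datetime64[ns]")

# Expected behaviour: changing the unit multiplies the integers by 1000.
expected = np_arr.astype("datetime64[ns]")

print(misread[0])   # 1970-01-16T08:09:36.000000000
print(expected[0])  # 2012-01-01T00:00:00.000000000
```

The first misread value matches the first element of Out[30] above, which is consistent with the integers being passed through unchanged while only the unit label switches to nanoseconds.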



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
