arrow-user mailing list archives

From Andrew Clancy <n...@achren.org>
Subject Re: [javascript] cant get timestamps in arrow 2.0
Date Thu, 17 Dec 2020 21:09:19 GMT
Yep - that's where I was expecting it!
These guys appear to implement decompression using pako:
https://github.com/usnistgov/jsfive - might be a good route to look into.



On Thu, 17 Dec 2020 at 19:19, Micah Kornfield <emkornfield@gmail.com> wrote:

> I don't know the state of support for compression codecs in JavaScript,
> but I don't think anyone has attempted to implement them.
>
> I couldn't find the compression feature listed on the library status docs
> [1].
>
> But we should add a line item for it.  Today, I think only C++ (and
> libraries that bind to it) has compression implemented.  I think a new PR
> for Java was just opened in the last few days.
>
> [1] https://arrow.apache.org/docs/status.html
>
> On Thu, Dec 17, 2020 at 10:10 AM Andrew Clancy <nite@achren.org> wrote:
>
>> So, I figured out the issue here - I had to turn off compression when
>> writing the file, by passing compression='uncompressed' to pyarrow's
>> feather.write_feather. Is there any way to read a compressed feather
>> file in arrow js?
>> See the comment under the first answer here:
>> https://stackoverflow.com/questions/64629670/how-to-write-a-pandas-dataframe-to-arrow-file/64648955#64648955
>> I couldn't find anything in the arrow docs or notebooks on this - I'm
>> assuming that's related to javascript compression libraries being so
>> limited.
>>
>>
>> On Mon, 14 Dec 2020 at 21:32, Andrew Clancy <nite@achren.org> wrote:
>>
>>> Hi all,
>>>
>>> I have a simple feather file created via pandas to_feather with a
>>> datetime64[ns] column, and I cannot read the timestamps in JavaScript
>>> with apache-arrow@2.0.0.
>>>
>>> See this notebook:
>>> https://observablehq.com/@nite/apache-arrow-timestamp-investigation
>>>
>>> I'm guessing I'm missing something. Has anyone got any suggestions, or
>>> decent examples of reading a file created in pandas? I've seen examples
>>> for apache-arrow@0.3.1 where dates are stored as an array of 2 ints.
>>>
>>> File was created with:
>>>
>>> import pandas as pd
>>> df = pd.read_parquet('sample.parquet')
>>> df.to_feather('sample-seconds.feather')
>>>
>>> Final Q: I'm assuming this is the best place for this question? Happy to
>>> post elsewhere if there's any other forums, or if this should be a JIRA
>>> ticket?
>>>
>>> Thanks!
>>> Andy
>>>
>>
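
On the "array of 2 ints" mentioned in the original question: a 64-bit timestamp can be carried as two 32-bit words. The sketch below shows how such a pair could be joined back into a nanosecond count and converted to a date. It is only an illustration under stated assumptions: the word order is taken to be [low, high], the unit is taken to be nanoseconds since the Unix epoch, and the helper names are hypothetical rather than part of any Arrow API.

```python
from datetime import datetime, timezone

MASK32 = 0xFFFFFFFF


def split_int64(value):
    """Split a signed 64-bit integer into (low, high) unsigned 32-bit words."""
    unsigned = value & 0xFFFFFFFFFFFFFFFF  # two's-complement view
    return unsigned & MASK32, (unsigned >> 32) & MASK32


def join_int64(low, high):
    """Rebuild a signed 64-bit integer from (low, high) 32-bit words."""
    unsigned = ((high & MASK32) << 32) | (low & MASK32)
    # Reinterpret the top bit as the sign bit.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned


def ns_to_datetime(ns):
    """Convert nanoseconds since the epoch to a UTC datetime.

    datetime only has microsecond resolution, so anything below a
    microsecond is dropped.
    """
    seconds, remainder_ns = divmod(ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=remainder_ns // 1000)


# Example value: the send time of the latest message in this thread.
sent = datetime(2020, 12, 17, 21, 9, 19, tzinfo=timezone.utc)
ns = int(sent.timestamp()) * 1_000_000_000

low, high = split_int64(ns)
print(low, high, ns_to_datetime(join_int64(low, high)))
```

If the values come out implausible, the assumed word order or unit is the first thing to check against the schema of the file.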
