From: Jacob Quinn
Date: Thu, 17 Dec 2020 21:25:41 -0700
Subject: Re: [javascript] cant get timestamps in arrow 2.0
To: user@arrow.apache.org
Cc: emkornfield@gmail.com
Reply-To: user@arrow.apache.org
Mailing-List: contact user-help@arrow.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000f127f605b6b580e8"

--000000000000f127f605b6b580e8
Content-Type: text/plain; charset="UTF-8"

>
> Today, I think only C++ (and libraries that bind to it) have compression
> implemented. I think a new PR for Java was just opened in the last few
> days.
>

Note the Julia implementation (Arrow.jl) supports compressing when writing
and decompressing when reading. (Not that it really helps for the
JavaScript side of things here, but just wanted to point it out as the
Julia code is relatively new to the Arrow project).

On Thu, Dec 17, 2020 at 2:10 PM Andrew Clancy wrote:

> Yep - that's where I was expecting it!
> These guys appear to implement decompression using pako:
> https://github.com/usnistgov/jsfive - might be a good route to look into.
>
>
>
> On Thu, 17 Dec 2020 at 19:19, Micah Kornfield wrote:
>
>> I don't know the support for the compression codecs in JavaScript, but I
>> don't think anyone has attempted to implement them.
>>
>> I couldn't find the compression feature listed on the library status docs
>> [1].
>>
>> But we should add a line item for it. Today, I think only C++ (and
>> libraries that bind to it) have compression implemented. I think a new PR
>> for Java was just opened in the last few days.
>>
>> [1] https://arrow.apache.org/docs/status.html
>>
>> On Thu, Dec 17, 2020 at 10:10 AM Andrew Clancy wrote:
>>
>>> So, I figured out the issue here - I had to turn off compression in the
>>> pyarrow call: feather.write_feather(compression='uncompressed'). Is there
>>> any way to read a compressed feather file in arrow js?
>>> See the comment under the first answer here:
>>> https://stackoverflow.com/questions/64629670/how-to-write-a-pandas-dataframe-to-arrow-file/64648955#64648955
>>> I couldn't find anything in the arrow docs or notebooks on this - I'm
>>> assuming that's related to JavaScript compression libraries being so
>>> limited.
>>>
>>>
>>> On Mon, 14 Dec 2020 at 21:32, Andrew Clancy wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have a simple feather file created via pandas to_feather with a
>>>> datetime64[ns] column, and cannot get timestamps in JavaScript with
>>>> apache-arrow@2.0.0
>>>>
>>>> See this notebook:
>>>> https://observablehq.com/@nite/apache-arrow-timestamp-investigation
>>>>
>>>> I'm guessing I'm missing something. Has anyone got any suggestions, or
>>>> decent examples of reading a file created in pandas? I've seen examples
>>>> from apache-arrow@0.3.1 where dates are stored as an array of 2 ints.
>>>>
>>>> File was created with:
>>>>
>>>> import pandas as pd
>>>> df = pd.read_parquet('sample.parquet')
>>>> df.to_feather('sample-seconds.feather')
>>>>
>>>> Final Q: I'm assuming this is the best place for this question?
>>>> Happy to post elsewhere if there are any other forums, or if this should
>>>> be a JIRA ticket?
>>>>
>>>> Thanks!
>>>> Andy
>>>>
>>>
--000000000000f127f605b6b580e8
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000f127f605b6b580e8--
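
On the "array of 2 ints" observation in the original question: a
datetime64[ns] value is a signed 64-bit count of nanoseconds since the Unix
epoch, and JavaScript numbers cannot hold every 64-bit integer exactly, so
one plausible reading is that each timestamp was exposed as two 32-bit
halves. The sketch below, written in Python to match the snippet in the
thread, shows how such a pair could be recombined. The low-word-first order
and the helper names combine_words and ns_to_datetime are assumptions made
for illustration; they are not taken from the apache-arrow API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def combine_words(low: int, high: int) -> int:
    """Join two 32-bit words into one signed 64-bit integer.

    Assumption: `low` holds the least significant 32 bits and `high`
    the most significant 32 bits.
    """
    value = ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
    # Reinterpret the unsigned 64-bit pattern as a signed value.
    if value >= 1 << 63:
        value -= 1 << 64
    return value


def ns_to_datetime(ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to an aware UTC datetime.

    Python datetimes have microsecond resolution, so anything below one
    microsecond is dropped.
    """
    return EPOCH + timedelta(microseconds=ns // 1000)


# Example value: the send time of this message, in nanoseconds.
ns = 1_608_265_541_000_000_000
low, high = ns & 0xFFFFFFFF, ns >> 32

restored = combine_words(low, high)
print(restored == ns)
print(ns_to_datetime(restored).isoformat())
```

If the halves turn out to be stored in the opposite order, swapping the two
arguments is the only change needed.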