From: jonathan mercier
To: emkornfield@gmail.com, user@arrow.apache.org
Date: Mon, 08 Mar 2021 18:08:03 +0100
Subject: Re: [python] calling filter method raise this error did not recognize Python value type when inferring an Arrow data type

Oh yes, it is not the same method as pyarrow.parquet … filters.

Thanks.

Best regards

On Friday, 5 March 2021 at 20:29 -0800, Micah Kornfield wrote:
> Hi Jonathan,
> Looking at the docs [1], I think the filter is supposed to be a
> boolean mask. So unfortunately, I think there are a few steps:
>
> Use an equality kernel [2] on the column of interest and then pass
> that as an argument to filter.
>
> -Micah
>
>
> [1] https://arrow.apache.org/docs/python/generated/pyarrow.compute.filter.html?highlight=table%20filter#pyarrow-compute-filter
> (Table.filter points here for full usage)
> [2] https://arrow.apache.org/docs/python/generated/pyarrow.compute.equal.html#pyarrow.compute.equal
>
> On Thu, Mar 4, 2021 at 2:24 AM jonathan mercier wrote:
> > I forgot to mention that foo is a pyarrow.Table.
> >
> > I load the data like this:
> >
> > from pyarrow.parquet import read_table
> > foo = read_table(somewhere)

-- 
Researcher computational biology
PhD, Jonathan MERCIER
Bioinformatics (LBI)
2, rue Gaston Crémieux
91057 Evry Cedex
Tel: (+33)1 60 87 83 44
Email: jonathan.mercier@cnrgh.fr