From: Daniel Nugent
To: user@arrow.apache.org
Date: Wed, 18 Mar 2020 18:40:01 -0400
Subject: Re: Access to parquet elements in Row via Rust API

I believe the Rust API is still nascent. But you can get that by looking through the Metadata. It’s a bit nastily nested though. The type descent path is File > ParquetMetaData > RowGroupMetaData > ColumnChunkMetaData > ColumnDescriptor > &str

You are currently required to track the mappings between index and column name yourself.
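
A minimal sketch of tracking that mapping yourself, using only the standard library. The helper name `build_column_index` and the sample column names are made up for illustration. In a real program the names would be collected by walking the metadata path above, which is not shown here since the exact calls depend on the parquet crate version.

```rust
use std::collections::HashMap;

/// Build a name -> index lookup from column names given in file order.
/// If two columns share a name, the later index overwrites the earlier one.
fn build_column_index<'a, I>(names: I) -> HashMap<String, usize>
where
    I: IntoIterator<Item = &'a str>,
{
    names
        .into_iter()
        .enumerate()
        .map(|(idx, name)| (name.to_string(), idx))
        .collect()
}

fn main() {
    // Hypothetical column names standing in for the &str values
    // found at the end of the metadata descent path.
    let names = ["id", "timestamp", "foo"];
    let index = build_column_index(names.iter().copied());

    match index.get("foo") {
        Some(idx) => println!("column \"foo\" is at index {}", idx),
        None => println!("no column named \"foo\""),
    }
}
```

The looked-up index can then be passed to the positional accessors, e.g. record.get_long(idx), so a change in column order only requires rebuilding the map.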

-Dan Nugent
On Mar 18, 2020, 17:49 -0400, Sebastian Fischmeister <sfischme@uwaterloo.ca>, wrote:
> Hi,
>
> I'm trying to write a simple Rust program that accesses a parquet file, looks for some values, and prints them.
>
> I took the example from [1] and can pick up individual column values in each row by directly addressing them (e.g., record.get_long(147)). However, there are two problems with that:
>
> 1) It's unclear whether 147 is really the column I want to read.
> 2) If there's a change in the provided parquet file, all the absolute numbers may change.
>
> I was hoping that there's a way to address an element through a hashmap as in record.get("foo"). Does the Rust API currently support this?
>
> Thanks,
> Sebastian
>
> [1] https://docs.rs/parquet/0.16.0/parquet/file/index.html