From: Pedro Miguel Duarte
To: dev@arrow.apache.org
Subject: Re: Comparing with Parquet
Date: Thu, 25 Feb 2016 17:11:53 -0800

I was wondering if someone could also elaborate on the comparison with
Tachyon (now called Alluxio).

On Feb 25, 2016 5:08 PM, "Chenliang (Liang, DataSight)" <chenliang613@huawei.com> wrote:

> I agree with Henry Robinson's points.
>
> In addition, Arrow is well suited to exchanging data efficiently, but it
> may only support data sizes up to the TB level. Parquet can support much
> larger data, but its performance cannot support fast queries.
>
> So for PB-level data with interactive (second-level) queries, can neither
> of them solve the problem?
>
> Regards
> Liang
>
> -----Original Message-----
> From: Henry Robinson [mailto:henry@cloudera.com]
> Sent: February 26, 2016 0:20
> To: dev@arrow.apache.org
> Subject: Re: Comparing with Parquet
>
> Think of Parquet as a format well-suited to writing very large datasets to
> disk, whereas Arrow is a format most suited to efficient storage in memory.
> You might read Parquet files from disk, and then materialize them in memory
> in Arrow's format.
>
> Both formats are designed around the idiosyncrasies of the target medium:
> Parquet is not designed to support efficient random access because disks
> aren't good at that, but Arrow has fast random access as a core design
> principle, to give just one example.
>
> Henry
>
> > On Feb 25, 2016, at 8:10 AM, Sourav Mazumder <sourav.mazumder00@gmail.com> wrote:
> >
> > Hi All,
> >
> > I am new to this and still trying to figure out where exactly Arrow fits
> > in the ecosystem of various Big Data technologies.
> >
> > In that respect, the first thing that came to my mind is how Arrow
> > compares with Parquet.
> >
> > In my understanding, Parquet also supports a very efficient columnar
> > format (with support for nested structures). It is already embraced
> > (supported) by various technologies like Impala (origin), Spark, Drill,
> > etc.
> >
> > The only thing I see missing in Parquet is support for SIMD-based
> > vectorized operations.
> >
> > Am I right, or am I missing many other differences between Arrow and
> > Parquet?
> >
> > Regards,
> > Sourav