Subject: Re: ORC double encoding optimization proposal
From: Dain Sundstrom
Date: Mon, 26 Mar 2018 11:47:47 -0700
To: dev@orc.apache.org
Cc: "user@orc.apache.org"

Doubling the seeks would be a big deal for many of our installations. In practice it means that we would need to fully buffer the two streams (assuming they are placed after each other) to avoid the extra seeks. If this could be done in one stream, I think it would be a big improvement. Is it possible to place the data row oriented, or to use sub chunks in the stream (e.g. N items from a then N items from b), so you can still stream?

-dain

> On Mar 26, 2018, at 2:24 AM, Xiening Dai wrote:
>
> Where does the 2x IO drop come from? Based on Cheng Xu’s data, Split + Zstd has ~15% improvement over PlainV2 + Zstd in terms of the file size. If I understand correctly, the total number of IO reads are almost the same, but Split will need an additional seek for each read.
>
> The random IOPS would eventually determines the throughput of HDD. IO queue can build up quickly when there are too many seeks and then drastically affects read/write performance. That’s the major concern, and it’s not related to locality.
>
>
>> On Mar 26, 2018, at 2:47 PM, Gopal Vijayaraghavan wrote:
>>
>>
>>> 2. Under seek or predicate pushdown scenario, there’s no need to load the entire stream.
>>
>> Yes, that is a valid scenario where the reader reads partial-streams & causes random IO.
>>
>> The current double encoding is actually 2 streams today & will continue to use 2 streams for the FLIP implementation.
>>
>> The SPLIT implementation will go from the current 2 streams to 4 streams (i.e 1+1->1+3 streams) & the total data IO will drop by ~2x or so. More so if one of the streams can be suppressed (like in my IoT data-set, where the sign-bit is always +ve for my electric meter data).
>>
>> The trade-offs seem to be working out on regular HDDs with locality & for LLAP SSD caches - if your use-cases are different, I'd like to hear more about it.
>>
>> The only significant random IO delays expected seem to be entirely within the HDFS API network hops (which offers 0% locality when data is erasure coded or for cloud-storage), which I hope to fix in the Hadoop-3.x branch with a new API.
>>
>> Cheers,
>> Gopal
>>
>>
>
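
For readers of the archive, the sub-chunk layout asked about at the top of this message ("N items from a then N items from b") can be illustrated with a minimal sketch. This is not ORC's on-disk format or any proposed implementation: the function names `interleave` and `read_rows` are hypothetical, the items are plain integers, and compression and encoding are ignored. It only shows that two logical streams placed in one physical stream as alternating sub-chunks can be read back with a single forward pass.

```python
from typing import Iterator, List, Sequence, Tuple


def interleave(a: Sequence[int], b: Sequence[int], n: int) -> List[int]:
    """Lay out two equal-length streams as alternating sub-chunks of n items.

    Hypothetical helper for illustration only; not part of ORC.
    """
    if len(a) != len(b):
        raise ValueError("streams must have the same number of items")
    if n <= 0:
        raise ValueError("chunk size must be positive")
    out: List[int] = []
    for start in range(0, len(a), n):
        out.extend(a[start:start + n])  # n items from a (fewer in the last chunk)
        out.extend(b[start:start + n])  # then the matching items from b
    return out


def read_rows(combined: Sequence[int], n: int) -> Iterator[Tuple[int, int]]:
    """Yield (a, b) pairs in row order using one forward pass and no seeks."""
    if len(combined) % 2 != 0:
        raise ValueError("combined stream must hold an even number of items")
    if n <= 0:
        raise ValueError("chunk size must be positive")
    pos = 0
    while pos < len(combined):
        # The final chunk pair may be shorter than n items per stream.
        size = min(n, (len(combined) - pos) // 2)
        chunk_a = combined[pos:pos + size]
        chunk_b = combined[pos + size:pos + 2 * size]
        yield from zip(chunk_a, chunk_b)
        pos += 2 * size


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [10, 20, 30, 40, 50]
    combined = interleave(a, b, n=2)
    print(combined)
    print(list(read_rows(combined, n=2)))
```

In this sketch the reader holds at most 2n items at a time, which is the point of the question: the buffering cost is bounded by the sub-chunk size rather than by the length of the whole stream.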