Subject: Re: Reading ORC Files from S3
From: Gopal Vijayaraghavan
To: "user@orc.apache.org"
CC: David Rosenstrauch, "rbalamohan@apache.org"
Date: Tue, 29 Sep 2015 06:00:34 +0000
In-Reply-To: <560A1704.6070802@darose.net>

Hi,

>OK, well that was easy. Figured out my issue and managed to get ORC
>working over s3a. And got a huge speed-up over s3n! (On the order of
>10x!)

Cool! S3n is rather old now, while the aws-sdk updates keep s3a moving.
>So yeah, I'm game for testing some new code when/if you're feeling
>motivated to work on this. Feel free to email me off-list and we can
>get into the details.

+Rajesh - who's actively chasing down the ORC + S3 changes today.

Your email came at an opportune moment, since Rajesh's ORC changes landed
in the hive-2.0 branch today

https://github.com/apache/hive/commit/a4c43f0335b33a75d2e9f3dc53b3cd33f8f115cf

Cheers,
Gopal

>
>On 09/28/2015 10:43 PM, David Rosenstrauch wrote:
>> Super helpful response - thanks so much! At least I know I'm not crazy
>> now! (And shouldn't spend any more time on tweaks trying to get this to
>> work on s3n.)
>>
>> Let me try to start testing this using out-of-the-box s3a protocol. (I
>> haven't been able to get that to work at all yet - keep getting "Unable
>> to load AWS credentials from any provider in the chain" errors.) Once
>> I'm able to get that far I'd be up for trying to test some new code. (As
>> long as it doesn't wind up taking too much time.)
>>
>> Will report back soon.
>>
>> Thanks again!
>>
>> DR
>>
>> On 09/28/2015 06:14 PM, Gopal Vijayaraghavan wrote:
>>>> avail. I was hoping perhaps someone on the list here might
>>>> be able to shed some light as to why we're having these problems and/or
>>>> have some suggestions on how we might be able to work around them.
>>> ...
>>>> (I.e., theoretically ORC should be able to skip reading large portions
>>>> of the index files by jumping directly to the index
>>>> records that match the supplied search criteria. (Or at least jumping to
>>>> a stripe close to them.)) But this is proving not to be the case.
>>>
>>> Not theoretically. ORC does that and that's the issue.
>>>
>>> S3n is badly broken for a columnar format & even S3A is missing a couple
>>> of features which are essential to get read performance over HTTP.
>>>
>>> Here's one example - every seek() disconnects & re-establishes an SSL
>>> connection in S3 (that fix is a ~2x perf increase for S3a).
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-12444
>>>
>>>
>>> In another scenario we found that a readFully(colOffset,... colSize) will
>>> open an unbounded reader in S3n instead of reading the fixed chunk off
>>> HTTP.
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-11867
>>>
>>>
>>> The lack of this means that even the short-lived keep-alive gets turned
>>> off by the S3 impl, when doing a forward-seek read pattern, because it is
>>> a recv buffer-dropping disconnect, not a complete request.
>>>
>>> The Amazon proprietary S3 drivers are not subject to these problems, so
>>> they work with ORC very well. It's the open source S3 filesystem impls
>>> which are holding us back.
>>>
>>>> Is ORC simply unable to work efficiently against data stored on S3n?
>>>> (I.e., due to network round-trips taking too long.)
>>>
>>> S3n is unable to handle any columnar format efficiently - it fires an
>>> HTTP GET for each seek, marked till end of the file. Any format which
>>> requires forward seeks or bounded readers is going to die via TCP window
>>> & round-trip thrashing.
>>>
>>>
>>> I know what's needed for s3a to work well with columnar readers
>>> (Parquet/ORC/RCFile included) and future proof it so that it will work
>>> fine when HTTP/2 arrives.
>>>
>>> If you're interested in being a guinea pig for S3a fixes, it is currently
>>> sitting on my back burner (I'm not a hadoop committer) - the FS fixes are
>>> about two weeks' worth of work for a single motivated dev.
>>>
>>> Cheers,
>>> Gopal
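To make the bounded-versus-unbounded read point in the quoted message concrete, here is a minimal Python sketch that is not taken from the thread. It builds the HTTP Range header value a client could send for each kind of read and totals the bytes each style of request asks for. The function names, the file size, and the sample column offsets are all hypothetical, and the sketch does not model the actual S3n or S3A code.

```python
from typing import Iterable, Optional, Tuple


def range_header(offset: int, length: Optional[int] = None) -> str:
    """Return an HTTP Range header value for a read starting at `offset`.

    With no length the range is open-ended (to end of file), which is the
    unbounded pattern the thread describes. With a length, the range covers
    exactly `length` bytes (HTTP byte ranges are inclusive on both ends).
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if length is None:
        return f"bytes={offset}-"
    if length <= 0:
        raise ValueError("length must be positive")
    return f"bytes={offset}-{offset + length - 1}"


def bytes_requested(reads: Iterable[Tuple[int, int]],
                    file_size: int,
                    bounded: bool) -> int:
    """Sum the bytes the requests ask the server for across all reads."""
    total = 0
    for offset, length in reads:
        # An open-ended request asks for everything from offset to EOF.
        total += length if bounded else file_size - offset
    return total


# Hypothetical column chunks as (offset, size) pairs in a 1 GiB file.
FILE_SIZE = 1 << 30
READS = [(4_096, 65_536), (10_000_000, 131_072), (500_000_000, 65_536)]

unbounded_total = bytes_requested(READS, FILE_SIZE, bounded=False)
bounded_total = bytes_requested(READS, FILE_SIZE, bounded=True)
```

The totals are bytes requested, not bytes transferred: with an open-ended request the client has to abandon the connection once it has what it needs, which is the disconnect cost the message describes. A bounded request completes normally, so the connection can in principle be reused.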