Mailing-List: contact user-help@orc.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@orc.apache.org
Subject: Re: Reading ORC Files from S3
MIME-Version: 1.0
From: Gopal Vijayaraghavan <gopal@hortonworks.com>
To: "user@orc.apache.org" <user@orc.apache.org>
CC: David Rosenstrauch <darose@darose.net>, "rbalamohan@apache.org"
	<rbalamohan@apache.org>
Thread-Topic: Reading ORC Files from S3
Thread-Index: AQHQ+jDU/sya+8YTQEyogB1yLrlWV55S9xaAgABLEYCAACGsAP//oBgA
Date: Tue, 29 Sep 2015 06:00:34 +0000
Message-ID: <D22F6686.35F6B%gopal@hortonworks.com>
References: <5609A6B9.6030701@darose.net>
 <FBA78400-EDCF-4D06-9C57-498A6B00722E@gmail.com>
 <D22F0399.35E6E%gopal@hortonworks.com> <5609FAC5.6070305@darose.net>
 <560A1704.6070802@darose.net>
In-Reply-To: <560A1704.6070802@darose.net>
Accept-Language: en-US
Content-Language: en-US
user-agent: Microsoft-MacOutlook/14.5.5.150821
Content-Type: text/plain; charset="us-ascii"
Content-ID: <336A0D97E088AC499862151A02D4B0ED@exch080.serverpod.net>
Content-Transfer-Encoding: quoted-printable

Hi,

>OK, well that was easy.  Figured out my issue and managed to get ORC
>working over s3a.  And got a huge speed-up over s3n!  (On the order of
>10x!)

Cool! S3n is rather old now, while the aws-sdk updates keep s3a moving.

>So yeah, I'm game for testing some new code when/if you're feeling
>motivated to work on this.  Feel free to email me off-list and we can
>get into the details.

+Rajesh - who's actively chasing down the ORC + S3 changes today.

Your email came at an opportune moment, since Rajesh's ORC changes landed
in the hive-2.0 branch today:

https://github.com/apache/hive/commit/a4c43f0335b33a75d2e9f3dc53b3cd33f8f115cf


Cheers,
Gopal

>
>On 09/28/2015 10:43 PM, David Rosenstrauch wrote:
>> Super helpful response - thanks so much!  At least I know I'm not crazy
>> now!  (And shouldn't spend any more time on tweaks trying to get this to
>> work on s3n.)
>>
>> Let me try to start testing this using out-of-the-box s3a protocol.  (I
>> haven't been able to get that to work at all yet - keep getting "Unable
>> to load AWS credentials from any provider in the chain" errors.)  Once
>> I'm able to get that far I'd be up for trying to test some new code. (As
>> long as it doesn't wind up taking too much time.)
>>
>> Will report back soon.
>>
>> Thanks again!
>>
>> DR
>>
>> On 09/28/2015 06:14 PM, Gopal Vijayaraghavan wrote:
>>>> avail.  I was hoping perhaps someone on the list here might
>>>> be able to shed some light as to why we're having these problems and/or
>>>> have some suggestions on how we might be able to work around them.
>>> ...
>>>>   (I.e., theoretically ORC should be able to skip reading large portions
>>>> of the index files by jumping directly to the index
>>>> records that match the supplied search criteria. (Or at least jumping to
>>>> a stripe close to them.))  But this is proving not to be the case.
>>>
>>> Not theoretically. ORC does that and that's the issue.
>>>
>>> S3n is badly broken for a columnar format & even S3A is missing a couple
>>> of features which are essential to get read performance over HTTP.
>>>
>>> Here's one example - every seek() disconnects & re-establishes an SSL
>>> connection in S3 (that fix is a ~2x perf increase for S3a).
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-12444
>>>
>>>
>>> In another scenario we found that a readFully(colOffset,... colSize) will
>>> open an unbounded reader in S3n instead of reading the fixed chunk off
>>> HTTP.
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-11867
>>>
>>>
>>> The lack of this means that even the short-lived keep-alive gets turned
>>> off by the S3 impl when doing a forward-seek read pattern, because it is a
>>> recv buffer-dropping disconnect, not a complete request.
>>>
>>> The Amazon proprietary S3 drivers are not subject to these problems, so
>>> they work with ORC very well. It's the open source S3 filesystem impls
>>> which are holding us back.
>>>
>>>> Is ORC simply unable to work efficiently against data stored on S3n?
>>>> (I.e., due to network round-trips taking too long.)
>>>
>>> S3n is unable to handle any columnar format efficiently - it fires an
>>> HTTP GET for each seek, marked till end of the file. Any format which
>>> requires forward seeks or bounded readers is going to die via TCP window &
>>> round-trip thrashing.
>>>
>>>
>>> I know what's needed for s3a to work well with columnar readers
>>> (Parquet/ORC/RCFile included) and future proof it so that it will work
>>> fine when HTTP/2 arrives.
>>>
>>> If you're interested in being a guinea pig for S3a fixes, it is currently
>>> sitting on my back burner (I'm not a hadoop committer) - the FS fixes are
>>> about two weeks' worth of work for a single motivated dev.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>
>
>
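
To make the quoted point about open-ended versus bounded reads concrete,
here is a minimal Python sketch. It is a toy accounting model and not the
Hadoop S3n/S3a code: FakeObject, read_open_ended, read_bounded, compare and
the file layout are all hypothetical names and numbers chosen for
illustration. It only counts how many bytes each style of request asks the
server for when reading the same column chunks.

```python
"""Toy model: bytes requested by open-ended vs. bounded ranged reads."""


class FakeObject:
    """Hypothetical stand-in for one S3 object; it only does accounting."""

    def __init__(self, size):
        self.size = size
        self.requests = 0         # number of GETs issued
        self.bytes_requested = 0  # total bytes covered by all requested ranges

    def get(self, start, end=None):
        """Model a ranged GET over [start, end); end=None means 'to EOF'."""
        if end is None:
            end = self.size
        if not 0 <= start <= end <= self.size:
            raise ValueError("range outside object")
        self.requests += 1
        self.bytes_requested += end - start
        return end - start


def read_open_ended(obj, offset, length):
    """One GET per seek, running to end of file (the S3n behaviour described above)."""
    obj.get(offset)  # asks for everything from offset to EOF
    return length    # the caller only consumes `length` bytes


def read_bounded(obj, offset, length):
    """Fixed-size chunk read: ask for exactly the bytes that are needed."""
    return obj.get(offset, offset + length)


def compare(size, chunks):
    """Return (open_ended_bytes, bounded_bytes) requested for the same chunks."""
    open_ended, bounded = FakeObject(size), FakeObject(size)
    for offset, length in chunks:
        read_open_ended(open_ended, offset, length)
        read_bounded(bounded, offset, length)
    return open_ended.bytes_requested, bounded.bytes_requested


if __name__ == "__main__":
    # Assumed layout: three 10 kB column chunks inside a 1 MB file.
    chunks = [(100_000, 10_000), (400_000, 10_000), (800_000, 10_000)]
    print(compare(1_000_000, chunks))  # prints (1700000, 30000)
```

The open-ended figure is an upper bound: a client that abandons the stream
early receives fewer bytes, but as noted in the thread, abandoning it means
dropping the connection rather than completing the request.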