From: Rajit Saha
To: user@hive.apache.org
Subject: Running hive queries in different queue
Date: Sat, 27 Feb 2016 01:34:18 +0000

Hi 

I want to run a Hive query in a queue other than the "default" queue from the Hive client command line. Can anybody please suggest a way to do it?

Regards
Rajit
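
For reference, a minimal sketch of one common way to do this, assuming the MapReduce execution engine (Tez reads tez.queue.name instead); the queue and table names are hypothetical:

    -- inside an interactive Hive CLI session
    hive> SET mapreduce.job.queuename=etl_queue;    -- hypothetical queue name
    hive> SELECT COUNT(*) FROM my_table;            -- hypothetical query

    -- or per invocation, from the shell
    $ hive --hiveconf mapreduce.job.queuename=etl_queue -e "SELECT COUNT(*) FROM my_table"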

On Feb 26, 2016, at 07:36, Patrick Duin <patduin@gmail.com> wrote:

Hi Prasanth.

Thanks for the quick reply!

The logs don't show much more of the stacktrace, I'm afraid:
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:809)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


The stacktrace isn't really the issue, though. The NullPointerException is a symptom of not being able to return any stripes; if you look at that line in the code, it is because the 'stripes' field is null, which should never happen. This, we think, is caused by failing namenode network traffic. We would have lots of IO warnings in the logs saying blocks cannot be found, e.g.:
16/02/01 13:20:34 WARN hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: java.lang.InterruptedException
        at org.apache.hadoop.ipc.Client.call(Client.java:1448)
        at org.apache.hadoop.ipc.Client.call(Client.java:1400)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy32.getServerDefaults(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:268)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy33.getServerDefaults(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1007)
        at org.apache.hadoop.hdfs.DFSClient.shouldEncryptData(DFSClient.java:2062)
        at org.apache.hadoop.hdfs.DFSClient.newDataEncryptionKey(DFSClient.java:2068)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:208)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:159)
        at org.apache.hadoop.hdfs.net.TcpPeerServer.peerFromSocketAndKey(TcpPeerServer.java:90)
        at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3123)
        at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:848)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:407)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:311)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:885)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:771)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1047)
        at org.apache.hadoop.ipc.Client.call(Client.java:1442)
        ... 33 more

Our job doesn't always fail; sometimes the splits do get calculated. We suspect that when the namenode is too busy our job hits some time-outs and the whole thing fails.

Our intuition has been the same as you suggest: bigger files are better. But we see a degradation in performance as soon as our files get bigger than the ORC block size. Keeping the file size within the ORC block size sounds silly, but when looking at the code (OrcInputFormat) we think it cuts out a bunch of code that is causing us problems. The code we are trying to hit is: https://github.com/apache/hive/blob/release-0.14.0/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L656. Avoiding the scheduling.
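
As an illustration of the sizes involved, a minimal sketch of the usual ORC writer knobs; the property names and values below are assumptions about typical defaults for this Hive version rather than anything stated in this thread:

    hive> SET hive.exec.orc.default.stripe.size=67108864;    -- 64MB stripes (illustrative value)
    hive> SET hive.exec.orc.default.block.size=268435456;    -- 256MB ORC block size (illustrative value)
    -- per-table equivalents can be set via TBLPROPERTIES, e.g. "orc.stripe.size"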

In our case we are not using any SARG but we do use column projection.

Any idea why we don't have this issue when we query the data via Hive?

Let me know if you need more information. Thanks for the insights, much appreciated.

Kind regards,
 Patrick


2016-02-25 22:20 GMT+01:00 Prasanth Jayachandran <pjayachandran@hortonworks.com>:

> On Feb 25, 2016, at 3:15 PM, Prasanth Jayachandran <pjayachandran@hortonworks.com> wrote:
>
> Hi Patrick
>
> Can you paste the entire stacktrace? It looks like the NPE happened during split generation, but the stack trace is incomplete, so it is hard to know what caused it.
>
> In Hive 0.14.0, the default stripe size was changed to 64MB. The default block size for ORC files is 256MB, so four stripes can fit in a block. ORC does padding to avoid stripes straddling HDFS blocks. During split calculation, the ORC footer, which contains stripe-level column statistics, is read to perform split pruning based on the predicate condition specified via a SARG (Search Argument).
>
> For example: Assume column 'state' is sorted and the predicate condition is 'state' = "CA"
> Stripe 1: min = AZ max = FL
> Stripe 2: min = GA max = MN
> Stripe 3: min = MS max = SC
> Stripe 4: min = SD max = WY
>
> In this case, only stripe 1 satisfies the above predicate condition. So only 1 split with stripe 1 will be created.
> So if there is a huge number of small files, then footers from all files have to be read to do split pruning. If there are a few large files, then only a few footers have to be read. Also, the minimum splittable position is a stripe boundary. So having fewer large files has the advantage of reading less data during split pruning.
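
As a sketch of how such a predicate reaches the ORC reader when querying from Hive, assuming hive.optimize.index.filter is the setting that pushes the WHERE clause down as a SARG (the table name is hypothetical):

    hive> SET hive.optimize.index.filter=true;            -- push WHERE predicates down to ORC as a SARG
    hive> SELECT * FROM orders_orc WHERE state = 'CA';     -- only stripes whose min/max range can contain 'CA' are read
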
> If you can send me the full stacktrace, I can tell what is causing the exception here. I will also let you know of any workaround/next hive version with the fix.
>
> In more recent Hive versions, Hive 1.2.0 onwards, OrcInputFormat has strategies to decide automatically when to read footers and when not to. You can configure the strategy that you want based on the workload. In the case of many small files, footers will not be read, and with large files footers will be read for split pruning.

The default strategy does it automatically (choosing when to read footers and when not to). It is configurable as well.
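
For example, the strategy can be pinned with the hive.exec.orc.split.strategy property (available from Hive 1.2.0 onwards; HYBRID is the default that chooses automatically). A minimal sketch:

    hive> SET hive.exec.orc.split.strategy=HYBRID;   -- default: choose per workload
    hive> SET hive.exec.orc.split.strategy=BI;       -- many small files: skip reading footers
    hive> SET hive.exec.orc.split.strategy=ETL;      -- few large files: read footers and prune by SARG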

>
> Thanks
> Prasanth
>
>> On Feb 25, 2016, at 7:08 AM, Patrick Duin <patduin@gmail.com> wrote:
>>
>> Hi,
>>
>> We've recently moved one of our datasets to ORC and we use Cascading and Hive to read this data. We've had problems reading the data via Cascading, because of the generation of splits.
>> We read in a large number of files (thousands) and they are about 1GB each. We found that the split calculation took minutes on our cluster and often didn't succeed at all (when our namenode was busy).
>> When digging through the code of 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.class' we figured out that if we make the files less than the ORC block size (256MB), the code would avoid lots of namenode calls. We applied this solution and made our files smaller, and that solved the problem. Split calculation in our job went from 10+ minutes to a couple of seconds and always succeeds.
>> We feel it is counterintuitive, as bigger files are usually better in HDFS. We've also seen that doing a Hive query on the data does not present this problem. Internally Hive seems to take a completely different execution path and is not using the OrcInputFormat but uses 'org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.class'.
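
As an illustration of that difference, the format Hive itself uses is selected by the hive.input.format property; the lines below only show what the two settings look like and are not a recommendation from this thread:

    hive> SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;   -- Hive's default, combines small splits
    hive> SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;          -- plain, non-combining wrapper
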
>>
>> Can someone explain the reason for this difference or shed some light on the behaviour we are seeing? Any help will be greatly appreciated. We are using hive-0.14.0.
>>
>> Kind regards,
>> Patrick
>>
>> Here is the stack-trace that we would see when our Cascading job failed to calculate the splits:
>> Caused by: java.lang.RuntimeException: serious problem
>>        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.waitForTasks(OrcInputFormat.java:478)
>>        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:949)
>>        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:974)
>>        at com.hotels.corc.mapred.CorcInputFormat.getSplits(CorcInputFormat.java:201)
>>        at cascading.tap.hadoop.io.MultiInputFormat.getSplits(MultiInputFormat.java:200)
>>        at cascading.tap.hadoop.io.MultiInputFormat.getSplits(MultiInputFormat.java:142)
>>        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
>>        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
>>        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
>>        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
>>        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:415)
>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>>        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
>>        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:585)
>>        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:580)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:415)
>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:580)
>>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:571)
>>        at cascading.flow.hadoop.planner.HadoopFlowStepJob.internalNonBlockingStart(HadoopFlowStepJob.java:106)
>>        at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:265)
>>        at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:184)
>>        at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:146)
>>        at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:48)
>>        ... 4 more
>> Caused by: java.lang.NullPointerException
>>        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:809)
>



