Subject: Re: Running hive queries in different queue
From: Mich Talebzadeh
To: user@hive.apache.org
Date: Sat, 27 Feb 2016 10:36:50 +0000
Hello.

What Hive client are you using? beeline

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 27 February 2016 at 01:34, Rajit Saha <rsaha@lendingclub.com> wrote:
Hi

I want to run a hive query in a queue other than the "default" queue from the hive client command line. Can anybody please suggest a way to do it?

Regards
Rajit
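
For reference, one common way to do this is to set the queue per session. A minimal sketch, assuming the MapReduce execution engine and a made-up YARN queue named 'etl' (the table name is a placeholder too):

    -- inside the hive CLI or a beeline session
    set mapreduce.job.queuename=etl;     -- older MRv1-style setups use mapred.job.queue.name
    -- on the Tez execution engine the analogous property is tez.queue.name
    select count(*) from some_table;

The same property can also be passed at connection time, e.g. via beeline's --hiveconf option.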

On Feb 26, 2016, at 07:36, Patrick Duin <patduin@gmail.com> wrote:

Hi Prasanth.

Thanks for the quick reply!

The logs don't show much more of the stacktrace I'm afraid:
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:809)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


The stacktrace isn't really the issue though. The NullPointerException is a symptom of not being able to return any stripes: if you look at that line in the code, it fails because the 'stripes' field is null, which should never happen. This, we think, is caused by failing namenode network traffic. We would get lots of I/O warnings in the logs saying blocks cannot be found, e.g.:
16/02/01 13:20:34 WARN hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: java.lang.InterruptedException
        at org.apache.hadoop.ipc.Client.call(Client.java:1448)
        at org.apache.hadoop.ipc.Client.call(Client.java:1400)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy32.getServerDefaults(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:268)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy33.getServerDefaults(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1007)
        at org.apache.hadoop.hdfs.DFSClient.shouldEncryptData(DFSClient.java:2062)
        at org.apache.hadoop.hdfs.DFSClient.newDataEncryptionKey(DFSClient.java:2068)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:208)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:159)
        at org.apache.hadoop.hdfs.net.TcpPeerServer.peerFromSocketAndKey(TcpPeerServer.java:90)
        at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3123)
        at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:848)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:407)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:311)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:885)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:771)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1047)
        at org.apache.hadoop.ipc.Client.call(Client.java:1442)
        ... 33 more

Our job doesn't always fail; sometimes splits do get calculated. We suspect that when the namenode is too busy our job hits some time-outs and the whole thing fails.

Our intuition has been the same as you suggest: bigger files are better. But we see a degradation in performance as soon as our files get bigger than the ORC block size. Keeping file size within the ORC block size sounds silly, but when looking at the code (OrcInputFormat) we think it cuts out a bunch of code that is causing us problems. The code we are trying to hit is: https://github.com/apache/hive/blob/release-0.14.0/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L656, avoiding the scheduling.

In our case we are not using any SARG but we do use column projection.

Any idea why, if we query the data via Hive, we don't have this issue?

Let me know if you need more information. Thanks for the insights, much appreciated.

Kind regards,
Patrick


2016-02-25 22:20 GMT+01:00 Prasanth Jayachandran <pjayachandran@hortonworks.com>:

> On Feb 25, 2016, at 3:15 PM, Prasanth Jayachandran <pjayachandran@hortonworks.com> wrote:
>
> Hi Patrick
>
> Can you paste the entire stacktrace? Looks like the NPE happened during split generation, but the stack trace is too incomplete to tell what caused it.
>
> In Hive 0.14.0, the stripe size was changed to 64MB. The default block size for ORC files is 256MB, so 4 stripes can fit in a block. ORC does padding to avoid stripes straddling HDFS blocks. During split calculation, the ORC footer, which contains stripe-level column statistics, is read to perform split pruning based on the predicate condition specified via a SARG (Search Argument).
>
> For example: assume column 'state' is sorted and the predicate condition is state = 'CA'
> Stripe 1: min = AZ, max = FL
> Stripe 2: min = GA, max = MN
> Stripe 3: min = MS, max = SC
> Stripe 4: min = SD, max = WY
>
> In this case, only stripe 1 satisfies the above predicate condition. So only 1 split, with stripe 1, will be created.
> So if there is a huge number of small files, then the footers from all files have to be read to do split pruning. If there are only a few large files, then only a few footers have to be read. Also, the minimum splittable unit is a stripe boundary. So having fewer large files has the advantage of reading less data during split pruning.
>
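
To make the above concrete, a hedged SQL sketch (table and column names are made up, not from this thread): with predicate pushdown enabled, the WHERE clause is handed to ORC as a SARG, and stripes whose min/max statistics cannot satisfy it are pruned.

    set hive.optimize.index.filter=true;     -- push the WHERE predicate down to ORC as a SARG
    select count(*) from customers_orc
    where state = 'CA';                      -- only stripes whose [min, max] range covers 'CA' are kept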
> If you can send me the full stacktrace, I can tell what is causing the exception here. I will also let you know of any workaround/next hive version with the fix.
>
> In more recent Hive versions, Hive 1.2.0 onwards, OrcInputFormat has strategies to decide automatically when to read footers and when not to. You can configure the strategy you want based on the workload: in the case of many small files, footers will not be read, and with large files footers will be read for split pruning.

The default strategy does it automatically (choosing between when to read footers and when not to). It is configurable as well.
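
A sketch of the knob being described, assuming Hive 1.2.0 or later (it does not exist in 0.14):

    -- HYBRID (the default) chooses automatically based on file sizes and counts;
    -- BI skips reading footers (fewer namenode calls), ETL reads footers for split pruning
    set hive.exec.orc.split.strategy=HYBRID;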

>
> Thanks
> Prasanth
>
>> On Feb 25, 2016, at 7:08 AM, Patrick Duin <patduin@gmail.com> wrote:
>>
>> Hi,
>>
>> We've recently moved one of our datasets to ORC and we use Cascading and Hive to read this data. We've had problems reading the data via Cascading, because of the generation of splits.
>> We read in a large number of files (thousands) and they are about 1GB each. We found that the split calculation took minutes on our cluster and often didn't succeed at all (when our namenode was busy).
>> When digging through the code of 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.class' we figured out that if we make the files smaller than the ORC block size (256MB), the code avoids lots of namenode calls. We applied this solution and made our files smaller, and that solved the problem. Split calculation in our job went from 10+ minutes to a couple of seconds and always succeeds.
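
As an aside, the 64MB stripe and 256MB ORC block figures mentioned in this thread are configurable writer defaults; a hedged sketch, assuming Hive 0.14's configuration names (the values shown are just the documented defaults, not a recommendation):

    set hive.exec.orc.default.stripe.size=67108864;    -- 64MB stripes
    set hive.exec.orc.default.block.size=268435456;    -- 256MB ORC block that stripes are padded into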
>> We feel it is counterintuitive, as bigger files are usually better in HDFS. We've also seen that running a Hive query on the data does not present this problem. Internally Hive seems to take a completely different execution path: it is not using OrcInputFormat but 'org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.class'.
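
A small sketch of the configuration behind that difference, assuming the standard hive.input.format property (the exact default varies by version and distribution):

    -- Hive picks its input format via this property; CombineHiveInputFormat groups files
    -- into fewer splits and takes a different split-generation path than using OrcInputFormat directly
    set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;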
>>
>> Can someone explain the reason for this difference or shed some light on the behaviour we are seeing? Any help will be greatly appreciated. We are using hive-0.14.0.
>>
>> Kind regards,
>> Patrick
>>
>> Here is the stack-trace that we would see when our Cascading job failed to calculate the splits:
>> Caused by: java.lang.RuntimeException: serious problem
>>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.waitForTasks(OrcInputFormat.java:478)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:949)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:974)
>>         at com.hotels.corc.mapred.CorcInputFormat.getSplits(CorcInputFormat.java:201)
>>         at cascading.tap.hadoop.io.MultiInputFormat.getSplits(MultiInputFormat.java:200)
>>         at cascading.tap.hadoop.io.MultiInputFormat.getSplits(MultiInputFormat.java:142)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
>>         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
>>         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
>>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:585)
>>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:580)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:580)
>>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:571)
>>         at cascading.flow.hadoop.planner.HadoopFlowStepJob.internalNonBlockingStart(HadoopFlowStepJob.java:106)
>>         at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:265)
>>         at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:184)
>>         at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:146)
>>         at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:48)
>>         ... 4 more
>> Caused by: java.lang.NullPointerException
>>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:809)
>




