drill-issues mailing list archives

From "Adam Gilmore (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1948) Reading large parquet files via HDFS fails
Date Thu, 08 Jan 2015 06:14:34 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268874#comment-14268874 ]

Adam Gilmore commented on DRILL-1948:
-------------------------------------

I seem to have worked out the cause.  This line is the ultimate culprit:

CompatibilityUtil.getBuf(input, directBuffer, pageLength);

which ends up doing an input.read(directBuffer) call (I couldn't work out where the source for CompatibilityUtil lives).

The fatal mistake CompatibilityUtil makes is assuming that input.read(ByteBuffer) will always read the remaining bytes in the buffer.  For HDFS, this is not always the case.  In my instance, it only reads chunks of 64kb (65,535 bytes) at a time, so for large Parquet files it requests pages of 128kb or so but only reads the first 64kb of them.
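
To make the partial read concrete, here is a minimal standalone sketch of the behaviour I'm describing (the file path and buffer size are illustrative only, not taken from Drill's code):

    import java.nio.ByteBuffer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PartialReadDemo {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Illustrative path; any sufficiently large HDFS file shows the same thing
        FSDataInputStream in = fs.open(new Path("/saleparquet/part-0.parquet"));
        ByteBuffer buf = ByteBuffer.allocateDirect(131072);  // a ~128kb "page"
        int n = in.read(buf);  // a single call may return far fewer bytes than requested
        System.out.println("requested " + buf.capacity() + ", got " + n
            + ", remaining " + buf.remaining());
        in.close();
      }
    }

On my cluster the single read() returns 65,535, leaving buf.remaining() > 0 even though the caller assumes the page has been fully read.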

The problem then compounds: after the first page read, the stream position has advanced only 65,535 bytes, so the next read lands in the middle of a page and tries to parse a page header there, hence the error.

There is probably a way to force HDFS to return larger chunks, but I'm not quite sure which setting would do that.  The real fix is to loop input.read() until the buffer has been completely filled.
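
A minimal sketch of such a loop (readFully is just a name I'm using for illustration, not an existing Drill method):

    import java.io.EOFException;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import org.apache.hadoop.fs.FSDataInputStream;

    public class ReadLoop {
      // Keep calling read(ByteBuffer) until the buffer is full, since a single
      // call may return fewer bytes than requested (e.g. 64kb chunks from HDFS).
      static void readFully(FSDataInputStream in, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
          int n = in.read(buf);  // partial reads are legal here
          if (n < 0) {
            throw new EOFException(buf.remaining() + " bytes short of a full page");
          }
        }
      }
    }

With something like this in place of the single read(), the stream position always ends up at the next page header, regardless of what chunk size HDFS decides to return.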

> Reading large parquet files via HDFS fails
> ------------------------------------------
>
>                 Key: DRILL-1948
>                 URL: https://issues.apache.org/jira/browse/DRILL-1948
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>         Environment: Hadoop 2.4.0 on Amazon EMR
>            Reporter: Adam Gilmore
>            Assignee: Parth Chandra
>            Priority: Critical
>
> There appears to be an issue with reading medium to large Parquet files via HDFS.  We have created a basic Parquet file with a schema like so:
> sellprice DOUBLE
> When filled with 10,000 double values, the following query in Drill works fine:
> select sum(sellprice) from hdfs.`/saleparquet`;
> When filled with 50,000 double values, the following error occurs:
> Query failed: Query stopped.[ 9aece851-48bc-4664-831e-d35bbfbcd1d5 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> The full stack trace is:
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 88fe95c3-b088-4674-8b65-967a7f4c3cdf: Query stopped.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,809 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error cd4123e4-7b9d-451d-90f0-3cc1ecf461e4: Failure while running fragment.
> java.lang.ArrayIndexOutOfBoundsException: null
> 2015-01-07 05:48:57,813 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman - Error 5db2c65b-cd10-4970-ba2b-f29b51fda923: Query failed: Failure while running fragment.[ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> [ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment.[ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
> [ cd4123e4-7b9d-451d-90f0-3cc1ecf461e4 on ip-10-8-1-70.ap-southeast-2.compute.internal:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:93) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:151) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:113) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:109) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:166) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> 2015-01-07 05:48:57,814 [2b533736-1ef8-c038-7d3b-f718829e7b74:frag:0:0] WARN  o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
> java.lang.InterruptedException: null
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301) ~[na:1.7.0_71]
>         at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_71]
>         at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.stop(ScreenCreator.java:186) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:144) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:117) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> If I fill with even more values (e.g. 100,000 or 1,000,000), I get a variety of other errors, such as:
> "Query failed: Query stopped., don't know what type: 14"
> coming from the Parquet engine.
> I am able to consistently replicate this in my environment with a basic Parquet file.  I can attach that file if necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
