drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5669) Multiple TPCH queries failed due to OOM
Date Tue, 18 Jul 2017 20:10:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092102#comment-16092102 ]

ASF GitHub Bot commented on DRILL-5669:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/879
  
    Did we want to increase the default max memory per query from 2 GB to 4 GB as Aman suggested?


> Multiple TPCH queries failed due to OOM
> ---------------------------------------
>
>                 Key: DRILL-5669
>                 URL: https://issues.apache.org/jira/browse/DRILL-5669
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>         Environment: RHEL 6.4 2.6.32-358.el6.x86_64, 10+1 nodes cluster
>            Reporter: Dechang Gu
>            Assignee: Boaz Ben-Zvi
>             Fix For: 1.11.0
>
>         Attachments: 26999476-174e-98fd-e21e-fd53f79284c7.sys.drill
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Running TPCH SF100 Parquet (and CSV) tests, multiple queries failed due to OOM. For example, Q16 hit the following error:
> {code}
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 on ucs-node10.perf.lab:31010]
>         at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>         at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:593)
>         at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:215)
>         at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:140)
>         at PipSQueak.fetchRows(PipSQueak.java:420)
>         at PipSQueak.runTest(PipSQueak.java:116)
>         at PipSQueak.main(PipSQueak.java:556)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> {code}
> And in drillbit.log:
> {code}
> 2017-07-12 11:34:11,670 ucs-node10.perf.lab [26999476-174e-98fd-e21e-fd53f79284c7:frag:1:11] INFO  o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes ran out of memory while executing the query.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 ]
>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:639) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:381) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:140) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_65]
>         at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_65]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) [hadoop-common-2.7.0-mapr-1607.jar:na]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> {code}
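
For readers comparing the two byte counts in the error text quoted above, here is a minimal Python sketch that extracts them and computes how far the allocation exceeded the limit. It is not part of Drill or of the original report: the helper name parse_allocator_fields and the ERROR_TEXT constant are hypothetical, and the only inputs are the "allocated memory" and "allocator limit" lines copied from the message.

```python
import re

# Lines copied verbatim from the error message quoted in the issue.
ERROR_TEXT = """\
Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 23500416
allocator limit 20000000
"""


def parse_allocator_fields(text):
    """Return (allocated, limit) in bytes from an error message of this shape.

    Hypothetical helper for illustration; raises ValueError if a field is missing.
    """
    allocated_match = re.search(r"allocated memory (\d+)", text)
    limit_match = re.search(r"allocator limit (\d+)", text)
    if allocated_match is None or limit_match is None:
        raise ValueError("message does not contain both allocator fields")
    return int(allocated_match.group(1)), int(limit_match.group(1))


allocated, limit = parse_allocator_fields(ERROR_TEXT)
overshoot = allocated - limit  # bytes held beyond the allocator limit

print(f"allocated={allocated} limit={limit} "
      f"overshoot={overshoot} ({overshoot / limit:.1%} over the limit)")
```

With the values above, the allocator held 3,500,416 bytes more than its 20,000,000-byte limit when the sort tried to allocate the sv2 and had only one batch group available to spill.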



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
