[ https://issues.apache.org/jira/browse/DRILL-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173687#comment-16173687 ]
ASF GitHub Bot commented on DRILL-5694:
---------------------------------------
Github user Ben-Zvi commented on a diff in the pull request:
https://github.com/apache/drill/pull/938#discussion_r140062933
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -536,6 +542,33 @@ public Builder pushContext(final String name, final double value) {
      * @return user exception
      */
     public UserException build(final Logger logger) {
+
+      // To allow for debugging:
+      // A spinner code to make the execution stop here while the file '/tmp/drillspin' exists.
+      // Can be used to attach a debugger, use jstack, etc.
+      // The process ID of the spinning thread should be in a file like /tmp/spin4148663301172491613.tmp,
+      // along with the error message.
+      File spinFile = new File("/tmp/drillspin");
+      if (spinFile.exists()) {
+        File tmpDir = new File("/tmp");
+        File outErr = null;
+        try {
+          outErr = File.createTempFile("spin", ".tmp", tmpDir);
+          BufferedWriter bw = new BufferedWriter(new FileWriter(outErr));
+          bw.write("Spinning process: " + ManagementFactory.getRuntimeMXBean().getName()
+              /* After upgrading to JDK 9 - replace with: ProcessHandle.current().pid() */);
+          bw.write("\nError cause: " +
+              (errorType == DrillPBError.ErrorType.SYSTEM
+                  ? ("SYSTEM ERROR: " + ErrorHelper.getRootMessage(cause)) : message));
+          bw.close();
+        } catch (Exception ex) {
+          logger.warn("Failed creating a spinner tmp message file", ex);
+        }
+        while (spinFile.exists()) {
+          try { sleep(1_000); } catch (Exception ex) { /* ignore interruptions */ }
--- End diff ---
Does query killing cause a user exception?
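
For context, a minimal standalone sketch of the spinner pattern the diff introduces (the class name and flow here are illustrative, not Drill's actual code): arm it by creating /tmp/drillspin, read the PID from the /tmp/spin*.tmp file, attach jstack or a debugger, then remove /tmp/drillspin to release the thread.

    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.lang.management.ManagementFactory;

    public class SpinnerDemo {
      public static void main(String[] args) throws Exception {
        // Arm the spinner by creating this file (e.g. "touch /tmp/drillspin").
        File spinFile = new File("/tmp/drillspin");
        if (spinFile.exists()) {
          // Record which process is spinning so jstack or a debugger can be attached.
          File outErr = File.createTempFile("spin", ".tmp", new File("/tmp"));
          try (BufferedWriter bw = new BufferedWriter(new FileWriter(outErr))) {
            // getName() returns "pid@hostname" on HotSpot JVMs; on JDK 9+
            // ProcessHandle.current().pid() gives the PID directly.
            bw.write("Spinning process: " + ManagementFactory.getRuntimeMXBean().getName());
          }
          // Spin until the control file is removed ("rm /tmp/drillspin" releases it).
          while (spinFile.exists()) {
            try { Thread.sleep(1_000); } catch (InterruptedException ex) { /* keep spinning */ }
          }
        }
      }
    }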
> hash agg spill to disk, second phase OOM
> ----------------------------------------
>
> Key: DRILL-5694
> URL: https://issues.apache.org/jira/browse/DRILL-5694
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.11.0
> Reporter: Chun Chang
> Assignee: Boaz Ben-Zvi
>
> | 1.11.0-SNAPSHOT | d622f76ee6336d97c9189fc589befa7b0f4189d6 | DRILL-5165: For limit all case, no need to push down limit to scan | 21.07.2017 @ 10:36:29 PDT
> Second phase agg ran out of memory. Not supposed to. Test data currently only accessible locally.
> /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
> Query:
> select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand) from parquet_500m_v1 group by row_count order by row_count limit 30
> Failed with exception
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far allocated: 534773760.
> Fragment 1:6
> [Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]
> (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far allocated: 534773760.
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
> Caused By (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate buffer of size 4194304 due to memory limit. Current allocation: 534773760
> org.apache.drill.exec.memory.BaseAllocator.buffer():238
> org.apache.drill.exec.memory.BaseAllocator.buffer():213
> org.apache.drill.exec.vector.IntVector.allocateBytes():231
> org.apache.drill.exec.vector.IntVector.allocateNew():211
> org.apache.drill.exec.test.generated.HashTableGen2141.allocMetadataVector():778
> org.apache.drill.exec.test.generated.HashTableGen2141.resizeAndRehashIfNeeded():717
> org.apache.drill.exec.test.generated.HashTableGen2141.insertEntry():643
> org.apache.drill.exec.test.generated.HashTableGen2141.put():618
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1173
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
> at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
> at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:593)
> at oadd.org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:215)
> at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:140)
> at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:220)
> at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:101)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far allocated: 534773760.
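
For reference, the numbers in the quoted error explain the failure arithmetically. A small sketch, assuming a simple allocated-plus-request <= limit check (variable names are illustrative, not Drill's BaseAllocator code):

    public class OomArithmetic {
      public static void main(String[] args) {
        long limit     = 536_870_912L; // "Memory limit: 536870912" (512 MiB)
        long allocated = 534_773_760L; // "so far allocated: 534773760"
        long request   =   4_194_304L; // "Unable to allocate buffer of size 4194304" (4 MiB)
        // 534773760 + 4194304 = 538968064 > 536870912, so the allocation fails:
        // only 2 MiB of headroom remained below the limit.
        System.out.println(allocated + request <= limit); // prints "false"
      }
    }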
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)