Date: Thu, 2 Mar 2017 22:01:45 +0000 (UTC)
From: "Rahul Challapalli (JIRA)"
To: issues@drill.apache.org
Subject: [jira] [Commented] (DRILL-5226) External Sort encountered an error while spilling to disk

[ https://issues.apache.org/jira/browse/DRILL-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893107#comment-15893107 ]

Rahul Challapalli commented on DRILL-5226:
------------------------------------------

Thanks for the explanation [~Paul.Rogers]. It looks like the record batch sizes are different on your Mac and on my MapR cluster. Check the attached log for scenario 3. Below is the relevant part:

{code}
2017-02-27 15:14:18,267 [274b4d38-c654-ba87-54d8-45a2bffd74d1:frag:1:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Actual batch schema & sizes {
  T0¦¦col1(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col2(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col3(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col4(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col5(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col6(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col7(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col8(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col9(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col10(std col. size: 54, actual col. size: 2, total size: 6142, data size: 2046, row capacity: 1023, density: 34)
  T0¦¦col11(std col. size: 54, actual col. size: 8, total size: 12280, data size: 8184, row capacity: 1023, density: 67)
  col1(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col2(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col3(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col4(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col5(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col6(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col7(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col8(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col9(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col10(std col. size: 54, actual col. size: 2, total size: 53248, data size: 2046, row capacity: 4095, density: 4)
  col11(std col. size: 54, actual col. size: 8, total size: 53248, data size: 8184, row capacity: 4095, density: 16)
  Records: 1023, Total size: 661476, Row width: 648, Density: 22}
{code}

The batch recorded above is smaller than the allocated memory, so I believe this situation still needs an explanation. What do you think?

> External Sort encountered an error while spilling to disk
> ---------------------------------------------------------
>
>                 Key: DRILL-5226
>                 URL: https://issues.apache.org/jira/browse/DRILL-5226
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>         Attachments: 277578d5-8bea-27db-0da1-cec0f53a13df.sys.drill, profile_scenario3.sys.drill, scenario3.log
>
>
> Environment:
> {code}
> git.commit.id.abbrev=2af709f
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> Nodes in Mapr Cluster : 1
> Data Size : ~ 0.35 GB
> No of Columns : 1
> Width of column : 256 chars
> {code}
> The query below fails before spilling to disk due to wrong estimates of the record batch size.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.width.max_per_node` = 1;
> +-------+--------------------------------------+
> |  ok   |               summary                |
> +-------+--------------------------------------+
> | true  | planner.width.max_per_node updated.  |
> +-------+--------------------------------------+
> 1 row selected (1.11 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.memory.max_query_memory_per_node` = 62914560;
> +-------+----------------------------------------------------+
> |  ok   |                      summary                       |
> +-------+----------------------------------------------------+
> | true  | planner.memory.max_query_memory_per_node updated.  |
> +-------+----------------------------------------------------+
> 1 row selected (0.362 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.disable_exchanges` = true;
> +-------+-------------------------------------+
> |  ok   |               summary               |
> +-------+-------------------------------------+
> | true  | planner.disable_exchanges updated.  |
> +-------+-------------------------------------+
> 1 row selected (0.277 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select * from (select * from dfs.`/drill/testdata/resource-manager/250wide-small.tbl` order by columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf';
> Error: RESOURCE ERROR: External Sort encountered an error while spilling to disk
> Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory limit. Current allocation: 62736000
> Fragment 0:0
> [Error Id: 1bb933c8-7dc6-4cbd-8c8e-0e095baac719 on qa-node190.qa.lab:31010] (state=,code=0)
> {code}
> Exception from the logs:
> {code}
> 2017-01-26 15:33:09,307 [277578d5-8bea-27db-0da1-cec0f53a13df:frag:0:0] INFO  o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: External Sort encountered an error while spilling to disk (Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory limit. Current allocation: 62736000)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External Sort encountered an error while spilling to disk
> Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory limit.
> Current allocation: 62736000
> [Error Id: 1bb933c8-7dc6-4cbd-8c8e-0e095baac719 ]
>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:603) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:411) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_111]
>         at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_111]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) [hadoop-common-2.7.0-mapr-1607.jar:na]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_111]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_111]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory limit. Current allocation: 62736000
>         at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:216) ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:191) ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.cache.VectorAccessibleSerializable.readFromStream(VectorAccessibleSerializable.java:112) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.BatchGroup.getBatch(BatchGroup.java:111) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.xsort.BatchGroup.getNextIndex(BatchGroup.java:137) ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.PriorityQueueCopierGen7.next(PriorityQueueCopierTemplate.java:76) ~[na:na]
>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:590) [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
>         ... 45 common frames omitted
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
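The allocation failure in the error above can be checked with a little arithmetic. This is a hedged sketch, not Drill's allocator code: the power-of-two rounding is inferred from the "(rounded from 618889)" wording in the message (618889 rounds up to exactly 2^20 = 1048576), and the limit value is the `planner.memory.max_query_memory_per_node` setting from the session.

```python
# Hypothetical illustration of the failed allocation, using only the
# numbers that appear in the error message and the session settings.

def next_power_of_two(n: int) -> int:
    """Smallest power of two >= n."""
    p = 1
    while p < n:
        p <<= 1
    return p

requested = 618889      # bytes requested while reading a spilled batch
rounded = next_power_of_two(requested)
limit = 62914560        # planner.memory.max_query_memory_per_node
current = 62736000      # "Current allocation" from the error

headroom = limit - current
print(rounded)             # 1048576, matching "rounded from 618889"
print(headroom)            # 178560 bytes of remaining headroom
print(rounded > headroom)  # True: the rounded request exceeds the headroom
```

So even though the raw request (618889 bytes) is under the remaining headroom arithmetic would suggest at first glance, the rounded request (1048576 bytes) exceeds the 178560 bytes left under the limit, which is consistent with the OutOfMemoryException.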