Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@drill.apache.org
Date: Fri, 20 Feb 2015 18:59:12 +0000 (UTC)
From: "Rahul Challapalli (JIRA)" <jira@apache.org>
To: issues@drill.apache.org
Message-ID: <JIRA.12776322.1424392597000.97320.1424458752879@Atlassian.JIRA>
In-Reply-To: <JIRA.12776322.1424392597000@Atlassian.JIRA>
References: <JIRA.12776322.1424392597000@Atlassian.JIRA>
 <JIRA.12776322.1424392597553@arcas>
Subject: [jira] [Commented] (DRILL-2274) Unable to allocate sv2 buffer after
 repeated attempts : JOIN, Order by used in query
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/DRILL-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329366#comment-14329366 ] 

Rahul Challapalli commented on DRILL-2274:
------------------------------------------

Still hitting the same error after a restart.

FYI: the join would produce

2 * 50000 * 50000 = 5 billion records (we project only one column, which is a simple integer).

Cluster Info :

# of nodes : 2
DRILL_MAX_DIRECT_MEMORY="32G"
DRILL_MAX_HEAP="4G"
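
For a sense of scale, the record count and memory settings above can be put side by side. This is only a rough sketch: the 4- and 8-byte value widths are assumptions (the comment only says "a simple integer"), reading "32G" as GiB is an assumption, and the function names are made up for illustration. It counts raw column values only and ignores sv2 buffers, spilling, and any per-batch overhead.

```python
# Rough sizing sketch; byte widths and the GiB reading of "32G" are assumptions.
ROWS_PER_KEY = 50_000   # each of the 2 records was copied 50000 times
DISTINCT_KEYS = 2       # data.json holds 2 records


def self_join_rows(distinct_keys: int, rows_per_key: int) -> int:
    """Rows produced by an inner self-join when every key is duplicated equally."""
    return distinct_keys * rows_per_key * rows_per_key


def gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


rows = self_join_rows(DISTINCT_KEYS, ROWS_PER_KEY)
direct_memory_gib = 2 * 32  # 2 nodes, DRILL_MAX_DIRECT_MEMORY="32G" on each

print(f"join output rows: {rows:,}")
for width in (4, 8):        # assumed bytes per projected integer value
    print(f"{width} bytes/value: {gib(rows * width):.1f} GiB of raw values "
          f"vs {direct_memory_gib} GiB direct memory across the cluster")
```

These figures are lower bounds on the data the sort has to handle, not an explanation of the failure.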

Let me know if you need anything more

> Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used in query
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-2274
>                 URL: https://issues.apache.org/jira/browse/DRILL-2274
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Rahul Challapalli
>            Assignee: Chris Westin
>             Fix For: 0.9.0
>
>         Attachments: data.json
>
>
> git.commit.id.abbrev=6676f2d
> The below query fails :
> {code}
> select sub1.uid from `data.json` sub1 inner join `data.json` sub2 on sub1.uid = sub2.uid order by sub1.uid;
> {code}
> Error from the logs :
> {code}
> 2015-02-20 00:24:08,431 [2b1981b0-149e-981b-f83f-512c587321d7:frag:1:2] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error 66dba4ff-644c-4400-ab84-203256dc2600: Failure while running fragment.
>  java.lang.RuntimeException: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts
>  	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:307) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>  	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
>  Caused by: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts
>  	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:516) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:305) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>  	... 16 common frames omitted
> {code}
> On a different drillbit in the cluster, I found the below message for the same run
> {code}
> 2015-02-20 00:24:08,435 [BitServer-6] WARN  o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {
>    state: FAILED
>    error {
>      error_id: "66dba4ff-644c-4400-ab84-203256dc2600"
>      endpoint {
>        address: "qa-node191.qa.lab"
>        user_port: 31010
>        control_port: 31011
>        data_port: 31012
>      }
>      message: "Failure while running fragment., Unable to allocate sv2 buffer after repeated attempts [ 66dba4ff-644c-4400-ab84-203256dc2600 on qa-node191.qa.lab:31010 ]\n"
>    }
> {code}
> I attached the data file, which has only 2 records. I manually duplicated those 2 records 50000 times and ran this query against the result.
> Let me know if you need anything else

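For anyone reproducing the data set described in the quoted report, the manual copying step could be scripted along these lines. This is a minimal sketch, assuming the attached data.json stores one JSON record per line (the report does not say); the function name and the output file name in the usage note are hypothetical.

```python
from pathlib import Path


def replicate_records(src: Path, dst: Path, copies: int = 50_000) -> int:
    """Write `copies` repetitions of every non-empty line of src to dst.

    Assumes src holds one JSON record per line (not confirmed by the report).
    Returns the number of records written.
    """
    text = src.read_text(encoding="utf-8")
    records = [line for line in text.splitlines() if line.strip()]
    with dst.open("w", encoding="utf-8") as out:
        for _ in range(copies):
            for record in records:
                out.write(record + "\n")
    return copies * len(records)
```

With the 2-record attachment, `replicate_records(Path("data.json"), Path("data_100k.json"))` would write 100000 records to the (hypothetically named) output file.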
