drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2835) IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT
Date Tue, 21 Apr 2015 06:15:58 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Venki Korukanti updated DRILL-2835:
-----------------------------------
    Attachment: DRILL-2835-1.patch

Issue here is:
One of the outgoing batches in PartitionSender receives a "terminate" message from its receiver,
which causes the OutgoingBatch to go in to"dropAll" mode (meaning ignore sending data to receiver
once a batch is filled). Problem is when in "dropAll" mode, as part of flush we clear the
vector buffers, but don't allocate any new buffers causing the OutgoingBatch copy to fail.

There are two ways to resolve this issue:
1. Before copying a record in each OutgoingBatch check if the batch is in "dropAll" mode.
2. Reuse the same buffers with out releasing by resetting the recordCount. This has overhead
of copying when the OutgoingBatch is in dropAll mode which happens when a LIMIT clause is
used.

> IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT

> ----------------------------------------------------------------------------------------
>
>                 Key: DRILL-2835
>                 URL: https://issues.apache.org/jira/browse/DRILL-2835
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 0.8.0
>            Reporter: Aman Sinha
>            Assignee: Venki Korukanti
>         Attachments: DRILL-2835-1.patch
>
>
> Following CTAS run on a TPC-DS 100GB scale factor on a 10-node cluster: 
> {code}
> alter session set `planner.enable_hashagg` = false;
> alter session set `planner.enable_multiphase_agg` = true;
> create table dfs.tmp.stream9 as 
> select cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk , cr_refunded_addr_sk
, count(*) from catalog_returns_dri100 
>  group by cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk , cr_refunded_addr_sk
>  limit 100
> ;
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: index: 1023, length: 1 (expected: range(0,
0))
>         at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:200) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.chk(DrillBuf.java:222) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.setByte(DrillBuf.java:621) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at org.apache.drill.exec.vector.UInt1Vector$Mutator.set(UInt1Vector.java:342)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.NullableBigIntVector$Mutator.set(NullableBigIntVector.java:372)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.NullableBigIntVector.copyFrom(NullableBigIntVector.java:284)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.doEval(PartitionerTemplate.java:370)
~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.copy(PartitionerTemplate.java:249)
~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4.doCopy(PartitionerTemplate.java:208)
~[na:na]
>         at org.apache.drill.exec.test.generated.PartitionerGen4.partitionBatch(PartitionerTemplate.java:176)
~[na:na]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message