drill-issues mailing list archives

From "Victoria Markman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2865) Drillbit runs out of memory on multiple consecutive CTAS
Date Fri, 10 Jul 2015 22:13:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622973#comment-14622973 ]

Victoria Markman commented on DRILL-2865:
-----------------------------------------

Verified fixed in 1.2

#Thu Jul 09 02:11:10 EDT 2015
git.commit.id.abbrev=72f9469

{code}
2,874,301 rows selected (45.832 seconds)
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
apache drill 1.0.0 
"what ever the mind of man can conceive and believe, drill can query"
iteration: 64
1/1          select 
ss_sold_date_sk , 
ss_sold_time_sk , 
ss_item_sk , 
ss_customer_sk , 
ss_cdemo_sk, 
count(*) from store_sales 
group by 
ss_sold_date_sk , 
ss_sold_time_sk , 
ss_item_sk , 
ss_customer_sk , 
ss_cdemo_sk 
;
2,874,301 rows selected (46.066 seconds)
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
apache drill 1.0.0 
"a little sql for your nosql"
iteration: 65
1/1          select 
ss_sold_date_sk , 
ss_sold_time_sk , 
ss_item_sk , 
ss_customer_sk , 
ss_cdemo_sk, 
count(*) from store_sales 
group by 
ss_sold_date_sk , 
ss_sold_time_sk , 
ss_item_sk , 
ss_customer_sk , 
ss_cdemo_sk 
;
{code}

> Drillbit runs out of memory on multiple consecutive CTAS
> --------------------------------------------------------
>
>                 Key: DRILL-2865
>                 URL: https://issues.apache.org/jira/browse/DRILL-2865
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Victoria Markman
>            Assignee: Steven Phillips
>             Fix For: 1.0.0
>
>         Attachments: drill-2865-no-compression.log, drillbit.log, scritpts.tar
>
>
> Hardware configuration:
>         - single node
>         - 64GB RAM
> Drill configuration
>         DRILL_MAX_DIRECT_MEMORY="8G"
>         DRILL_MAX_HEAP="4G"
>         `planner.enable_multiphase_agg` = false;
>         `store.parquet.block-size` = 134217728;
>         `planner.enable_mux_exchange` = false;
>         `exec.min_hash_table_size` = 67108864;
>         `planner.enable_hashagg` = true; 
>         `planner.width.max_per_node` = 23;
> Aggregation query on TPCDS scale factor 1: 
> {code}
>         select 
>                 ss_sold_date_sk , 
>                 ss_sold_time_sk , 
>                 ss_item_sk , 
>                 ss_customer_sk , 
>                 ss_cdemo_sk, 
>                 count(*) from store_sales
>         group by 
>                 ss_sold_date_sk , 
>                 ss_sold_time_sk , 
>                 ss_item_sk , 
>                 ss_customer_sk , 
>                 ss_cdemo_sk
> ;
> {code}
> 1. Executing CTAS with this query and store.format = 'parquet' consistently fails on iteration #9 with this configuration
> 2. Ran the query by itself: 47 iterations completed successfully
> 3. Ran CTAS with this query and store.format = 'csv': 30 iterations did not reproduce the problem
> Attached:
>       - drillbit.log
>       - scripts.tar (contains script that reproduces OOM)
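
The contents of the attached scripts.tar are not shown in this message, so the following is only a rough sketch of how the repeated-CTAS reproduction described above might be generated. It builds a SQL script that sets `store.format` once and then issues one CTAS per iteration. The function name `build_ctas_script`, the table prefix `ctas_oom_test`, and the choice of a fresh table name per iteration (instead of dropping and recreating one table) are assumptions made for illustration, not taken from the original scripts. Using `alter session` rather than `alter system` is likewise an assumption.

```python
# Sketch only: generates a SQL script that runs the same CTAS N times.
# Names below (build_ctas_script, ctas_oom_test) are hypothetical.

QUERY = """select
  ss_sold_date_sk,
  ss_sold_time_sk,
  ss_item_sk,
  ss_customer_sk,
  ss_cdemo_sk,
  count(*)
from store_sales
group by
  ss_sold_date_sk,
  ss_sold_time_sk,
  ss_item_sk,
  ss_customer_sk,
  ss_cdemo_sk"""


def build_ctas_script(iterations, store_format="parquet",
                      table_prefix="ctas_oom_test", options=None):
    """Return SQL text: option settings, then one CTAS per iteration.

    options maps an option name to its SQL literal as a string,
    e.g. {"planner.enable_hashagg": "true"}.
    """
    lines = [f"alter session set `store.format` = '{store_format}';"]
    for name, literal in (options or {}).items():
        lines.append(f"alter session set `{name}` = {literal};")
    for i in range(1, iterations + 1):
        # A distinct table per iteration avoids needing to drop tables.
        lines.append(f"create table {table_prefix}_{i} as\n{QUERY};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Nine iterations matches the failure point reported for parquet.
    print(build_ctas_script(9, options={"planner.enable_hashagg": "true"}))
```

The generated text could be saved to a file and fed to whatever SQL client is in use; changing `store_format` to `'csv'` would correspond to the third case in the description.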



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
