spark-issues mailing list archives

From "Steve Johnston (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-14389) OOM during BroadcastNestedLoopJoin
Date Tue, 05 Apr 2016 19:49:25 GMT

    [ https://issues.apache.org/jira/browse/SPARK-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227001#comment-15227001
] 

Steve Johnston commented on SPARK-14389:
----------------------------------------

[~hvanhovell] You're correct, with the exception of step 3, which should be "Repartition to 1 partition".

This *is* a simplified query to isolate/reproduce the problem. 
Here's the Physical Plan. The other plans are available in {{plans.txt}} attached.
{noformat}
== Physical Plan ==
BroadcastNestedLoopJoin BuildLeft, Inner, None
:- InMemoryColumnarTableScan [l_orderkey#0,l_partkey#1,l_suppkey#2,l_linenumber#3,l_quantity#4,l_extendedprice#5,l_discount#6,l_tax#7,l_returnflag#8,l_linestatus#9,l_shipdate#10,l_commitdate#11,l_receiptdate#12,l_shipinstruct#13,l_shipmode#14,l_comment#15],
InMemoryRelation [l_orderkey#0,l_partkey#1,l_suppkey#2,l_linenumber#3,l_quantity#4,l_extendedprice#5,l_discount#6,l_tax#7,l_returnflag#8,l_linestatus#9,l_shipdate#10,l_commitdate#11,l_receiptdate#12,l_shipinstruct#13,l_shipmode#14,l_comment#15],
true, 10000, StorageLevel(false, true, false, false, 1), TungstenExchange RoundRobinPartitioning(1),
None, None
+- InMemoryColumnarTableScan [l_orderkey#744,l_partkey#745,l_suppkey#746,l_linenumber#747,l_quantity#748,l_extendedprice#749,l_discount#750,l_tax#751,l_returnflag#752,l_linestatus#753,l_shipdate#754,l_commitdate#755,l_receiptdate#756,l_shipinstruct#757,l_shipmode#758,l_comment#759],
InMemoryRelation [l_orderkey#744,l_partkey#745,l_suppkey#746,l_linenumber#747,l_quantity#748,l_extendedprice#749,l_discount#750,l_tax#751,l_returnflag#752,l_linestatus#753,l_shipdate#754,l_commitdate#755,l_receiptdate#756,l_shipinstruct#757,l_shipmode#758,l_comment#759],
true, 10000, StorageLevel(false, true, false, false, 1), TungstenExchange RoundRobinPartitioning(1),
None, None
{noformat}
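
To make the plan easier to read: the {{None}} after {{Inner}} appears to be the join condition, i.e. there is none, so each row on one side is paired with every row on the other. Below is a minimal pure-Python sketch of that behaviour. It is not Spark's implementation, and the names {{nested_loop_join}} and {{self_join_row_count}} are hypothetical, used only for illustration.

```python
from itertools import product


def nested_loop_join(left, right, condition=None):
    """Naive nested-loop join over two lists of tuples (illustrative only).

    With condition=None nothing is filtered out, so the result is the
    full Cartesian product of the two inputs.
    """
    for l_row, r_row in product(left, right):
        if condition is None or condition(l_row, r_row):
            yield l_row + r_row


def self_join_row_count(n_rows):
    """Row count of a condition-less self join of an n_rows table."""
    return n_rows * n_rows


# Tiny stand-in table; the real lineitem rows are not reproduced here.
rows = [(1, "a"), (2, "b"), (3, "c")]
joined = list(nested_loop_join(rows, rows))
```

The output size grows with the square of the input size, so even a small table can produce a large result when joined with itself this way.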

> OOM during BroadcastNestedLoopJoin
> ----------------------------------
>
>                 Key: SPARK-14389
>                 URL: https://issues.apache.org/jira/browse/SPARK-14389
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: OS: Amazon Linux AMI 2015.09
> EMR: 4.3.0
> Hadoop: Amazon 2.7.1
> Spark 1.6.0
> Ganglia 3.7.2
> Master: m3.xlarge
> Core: m3.xlarge
> m3.xlarge: 4 CPU, 15GB mem, 2x40GB SSD
>            Reporter: Steve Johnston
>         Attachments: lineitem.tbl, plans.txt, sample_script.py, stdout.txt
>
>
> When executing the attached sample_script.py in client mode with a single executor, a
> "java.lang.OutOfMemoryError: Java heap space" exception occurs during the self join of a small table,
> TPC-H lineitem generated for a 1M dataset. See also the attached execution log, stdout.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

