spark-issues mailing list archives

From "Steve Johnston (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-14389) OOM during BroadcastNestedLoopJoin
Date Thu, 07 Apr 2016 19:56:25 GMT

    [ https://issues.apache.org/jira/browse/SPARK-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230938#comment-15230938 ]

Steve Johnston commented on SPARK-14389:
----------------------------------------

While watching local disk and HDFS usage, very little disk space is consumed before the OOM
exception occurs. For example, {{du}} on the local disk shows no movement, and {{dfsadmin}} reports
the following just prior to the OOM:
{code}
Configured Capacity: 79400804352 (73.95 GB)
Present Capacity: 77983436380 (72.63 GB)
DFS Remaining: 77748433480 (72.41 GB)
DFS Used: 235002900 (224.12 MB)
DFS Used%: 0.30%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
{code}
Prior to the launch of the job:
{code}
Configured Capacity: 79400804352 (73.95 GB)
Present Capacity: 77966117609 (72.61 GB)
DFS Remaining: 77759155534 (72.42 GB)
DFS Used: 206962075 (197.37 MB)
DFS Used%: 0.27%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
{code}
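Spelling out the delta between the two reports (a quick sanity check; the byte counts are copied from the {{dfsadmin}} output above):

```python
# Rough arithmetic on the two dfsadmin reports above.
used_before = 206962075   # DFS Used prior to the job launch (197.37 MB)
used_at_oom = 235002900   # DFS Used just before the OOM (224.12 MB)

delta_bytes = used_at_oom - used_before
delta_mb = delta_bytes / (1024 * 1024)
print(f"DFS Used grew by {delta_bytes} bytes (~{delta_mb:.2f} MB)")
```

So only about 27 MB of HDFS space is consumed over the run, consistent with the join spilling little or nothing to disk before the heap is exhausted.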


> OOM during BroadcastNestedLoopJoin
> ----------------------------------
>
>                 Key: SPARK-14389
>                 URL: https://issues.apache.org/jira/browse/SPARK-14389
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: OS: Amazon Linux AMI 2015.09
> EMR: 4.3.0
> Hadoop: Amazon 2.7.1
> Spark 1.6.0
> Ganglia 3.7.2
> Master: m3.xlarge
> Core: m3.xlarge
> m3.xlarge: 4 CPU, 15GB mem, 2x40GB SSD
>            Reporter: Steve Johnston
>         Attachments: lineitem.tbl, plans.txt, sample_script.py, stdout.txt
>
>
> When executing the attached sample_script.py in client mode with a single executor, an exception
> occurs, "java.lang.OutOfMemoryError: Java heap space", during the self join of a small table,
> TPC-H lineitem generated for a 1M dataset. Also see the attached execution log, stdout.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

