spark-issues mailing list archives

From "Sandeep Pal (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SPARK-10940) Too many open files Spark Shuffle
Date Thu, 08 Oct 2015 00:04:26 GMT

     [ https://issues.apache.org/jira/browse/SPARK-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Pal closed SPARK-10940.
-------------------------------
    Resolution: Fixed

Could not reproduce the issue after restarting all the machines.

> Too many open files Spark Shuffle
> ---------------------------------
>
>                 Key: SPARK-10940
>                 URL: https://issues.apache.org/jira/browse/SPARK-10940
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, SQL
>    Affects Versions: 1.5.0
>         Environment: 6-node standalone Spark cluster with 1 master and 5 worker nodes on CentOS 6.6 for all nodes. Each node has > 100 GB memory and 36 cores.
>            Reporter: Sandeep Pal
>
> Running TeraSort through Spark SQL on data generated by TeraGen in Hadoop. The generated data size is ~456 GB.
> TeraSort passes with --total-executor-cores = 40 but fails with --total-executor-cores = 120.
> I tried increasing the ulimit to 10k, but the problem persists.
> Note: The failing 120-core configuration worked with Spark core code written directly on RDDs. The failure occurs only when using Spark SQL.
> Below is the error message from one of the executor nodes:
> java.io.FileNotFoundException: /tmp/spark-e15993e8-51a4-452a-8b86-da0169445065/executor-0c661152-3837-4711-bba2-2abf4fd15240/blockmgr-973aab72-feb8-4c60-ba3d-1b2ee27a1cc2/3f/temp_shuffle_7741538d-3ccf-4566-869f-265655ca9c90 (Too many open files)
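
The open-files limit raised with ulimit is a per-process setting: a process keeps the limit it was started with, so a value raised in a shell does not apply to daemons that are already running. Whether that explains this report is not confirmed in the thread. The following is a minimal diagnostic sketch in Python, assuming a Linux host with /proc mounted; the helper names nofile_limits and open_fd_count are hypothetical and not part of Spark. It shows how to read the limit a process actually has and how many descriptors it currently holds.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count the open file descriptors of a process.

    Assumption: Linux with /proc mounted. Inspecting another user's
    process usually requires matching ownership or root privileges.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open in this process: {open_fd_count()}")
```

To inspect a running executor rather than the script itself, pass the executor JVM's process id to open_fd_count; the limits that process runs with can be read from /proc/<pid>/limits.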



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
