spark-reviews mailing list archives

From andrewor14 <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-12155] Fix executor OOM in unified memo...
Date Thu, 10 Dec 2015 02:49:45 GMT
GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/10240

    [SPARK-12155] Fix executor OOM in unified memory management

    **Problem.** In unified memory management, acquiring execution memory may evict
storage memory. However, the space freed by evicting cached blocks is shared among all
active tasks, and each task's acquisition is limited by a per-task cap. Because that cap is
computed incorrectly, an acquisition can fail even after eviction, leading to OOMs and premature spills.
    
    **Example.** Suppose total memory is 1000B, cached blocks occupy 900B, `spark.memory.storageFraction`
is 0.4, and there are two active tasks. In this case, the cap on task execution memory is computed
from only the 100B that is currently free: 100B / 2 = 50B. If task A tries to acquire 200B, it will
evict 100B of storage but can only acquire 50B because of the incorrect cap. For another example, see this [regression test](https://github.com/andrewor14/spark/blob/fix-oom/core/src/test/scala/org/apache/spark/memory/UnifiedMemoryManagerSuite.scala#L233).
    
    **Solution.** Fix the cap on task execution memory. It should take into account the space
that could be freed by evicting storage, in addition to the memory currently available
to execution. In the example above, storage can be evicted down to its 400B region (0.4 × 1000B),
freeing up to 500B on top of the 100B already free, so the correct cap would have been 600B / 2 = 300B.
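
The arithmetic in the example can be checked with a minimal sketch. It is written in Python rather than Spark's Scala, and it is not the code in this patch: the function names `old_cap` and `fixed_cap` are hypothetical, and the model assumes no execution memory is in use yet and that memory is split evenly among active tasks.

```python
def old_cap(total, storage_used, num_tasks):
    """Per-task cap as described in the problem statement (assumed model):
    only memory that is currently free counts toward execution."""
    free = total - storage_used
    return free // num_tasks


def fixed_cap(total, storage_used, storage_fraction, num_tasks):
    """Per-task cap that also counts storage memory that could be evicted.

    Storage is assumed to be evictable only down to its protected region,
    total * storage_fraction.
    """
    storage_region = round(total * storage_fraction)
    evictable = max(storage_used - storage_region, 0)
    free = total - storage_used
    return (free + evictable) // num_tasks


def grant(requested, cap):
    """A task never receives more than its cap."""
    return min(requested, cap)


if __name__ == "__main__":
    total, cached, fraction, tasks = 1000, 900, 0.4, 2
    before = old_cap(total, cached, tasks)
    after = fixed_cap(total, cached, fraction, tasks)
    print("old cap:", before, "granted:", grant(200, before))
    print("fixed cap:", after, "granted:", grant(200, after))
```

With the numbers from the example, the old cap limits task A's 200B request to 50B, while the corrected cap of 300B lets the full request through.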

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark fix-oom

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10240
    
----
commit 68337754b8541d8ac497c19ae531a79a91904708
Author: Andrew Or <andrew@databricks.com>
Date:   2015-12-09T23:45:48Z

    Pass in callbacks (gross)

commit 35392f5e5e8c152e8cf516ffb4f7c8def0df8361
Author: Andrew Or <andrew@databricks.com>
Date:   2015-12-10T00:02:18Z

    Notify all on task completion

commit cd0c680161e9c6f8044401ec1c2c3a4e83d4b6a1
Author: Andrew Or <andrew@databricks.com>
Date:   2015-12-10T02:10:13Z

    Rename silly method names + add detailed comments

----


