spark-issues mailing list archives

From "Ye Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21961) Filter out BlockStatuses Accumulators during replaying history logs in Spark History Server
Date Tue, 19 Sep 2017 18:48:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172162#comment-16172162 ]

Ye Zhou commented on SPARK-21961:
---------------------------------

[~zsxwing] Can you help to take a look? Thanks.

> Filter out BlockStatuses Accumulators during replaying history logs in Spark History Server
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21961
>                 URL: https://issues.apache.org/jira/browse/SPARK-21961
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Ye Zhou
>         Attachments: Objects_Count_in_Heap.png, One_Thread_Took_24GB.png
>
>
> As described in SPARK-20923, TaskMetrics._updatedBlockStatuses uses a lot of memory in
the Driver. Recently we noticed the same issue in the Spark History Server. SPARK-20084
stops these accumulators from being written to new event logs, but multiple Spark versions,
including 1.6.x and 2.1.0, are deployed in our production cluster, and none of them include
these two patches.
> In this case, those accumulators will still show up in event logs and the Spark History
Server will replay them. The Spark History Server continuously hits severe full GCs even
though we tried to limit the cache size as well as enlarge the heap size to 40GB. We also
tried different GC tuning parameters, such as CMS and G1GC. None of them worked.
> We took a heap dump and found that the top memory-consuming objects are BlockStatus. There
was even one thread that took 23GB of heap while replaying a single log file.
> Since the former two tickets resolved the related issues in both the driver and the
writing of history logs, we should also consider adding this filter to the Spark History
Server in order to decrease the memory consumption of replaying a single history log. For
use cases like ours, where multiple older versions of Spark are deployed, this filter
should be pretty useful.
> We have deployed our Spark History Server with this filter, and it works fine in our
production cluster: it has processed thousands of logs with only a few full GCs in total.
> !https://issues.apache.org/jira/secure/attachment/12886191/Objects_Count_in_Heap.png!
> !https://issues.apache.org/jira/secure/attachment/12886190/One_Thread_Took_24GB.png!
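
For reference, the filtering described in the comment above can be sketched roughly as
follows. This is an illustrative Python sketch, not the actual Spark patch (which would be
Scala code in the History Server's replay path); it assumes the Spark 2.x event-log JSON
field names ("Task Info", "Accumulables", "Task Metrics", "Updated Blocks") and the
internal accumulator name "internal.metrics.updatedBlockStatuses" -- verify both against
the Spark version at hand.

```python
import json

# Assumed name of the internal accumulator carrying per-task block statuses
# in Spark 2.x event logs (see InternalAccumulator in the Spark source).
UPDATED_BLOCK_STATUSES = "internal.metrics.updatedBlockStatuses"

def strip_block_statuses(event_line: str) -> str:
    """Drop updatedBlockStatuses data from one event-log JSON line.

    Returns the line unchanged (modulo re-serialization) if it carries
    no such accumulator or block-status metrics.
    """
    event = json.loads(event_line)
    # Clear the per-task updated-block list in the task metrics, if present.
    metrics = event.get("Task Metrics")
    if metrics and "Updated Blocks" in metrics:
        metrics["Updated Blocks"] = []
    # Filter the block-status accumulator out of the task's accumulables.
    accums = event.get("Task Info", {}).get("Accumulables")
    if accums:
        event["Task Info"]["Accumulables"] = [
            a for a in accums if a.get("Name") != UPDATED_BLOCK_STATUSES
        ]
    return json.dumps(event)
```

Applying such a filter per line before handing events to the replay listener keeps the
giant BlockStatus collections from ever being deserialized into the History Server heap.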



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

