spark-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-18085) Better History Server scalability for many / large applications
Date Mon, 17 Apr 2017 20:57:41 GMT

    [ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971594#comment-15971594 ]

Marcelo Vanzin edited comment on SPARK-18085 at 4/17/17 8:57 PM:
-----------------------------------------------------------------

I'm getting close to the point where I think the code can start to trickle in, though I want
to wait until the 2.2 branch gets going before sending PRs. In the meantime, I'm keeping
"private PRs" in my fork, one per milestone, so that anybody interested in getting familiar
with the code can easily provide comments:

https://github.com/vanzin/spark/pulls

All the UI that the SHS currently shows (core + SQL, but not streaming) is now kept in a disk
store. Since streaming is not shown in the SHS, I'm not planning to touch it, aside from the
small changes I already made that were required by internal API changes in core.

What's left, from my point of view:
- managing disk space in the SHS so that a large number of apps doesn't cause the SHS to fill
local disks
- limiting the number of jobs / stages / tasks / etc kept in the store (similar to existing
settings, which the code doesn't yet honor)
- an in-memory implementation of the store (in case someone wants lower latency or can't /
does not want to use the disk store)
- more tests, and more testing
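
To make the second and third items concrete, here is a minimal sketch of an in-memory store
that caps how many entries of each type it keeps and evicts the oldest ones first. It is only
an illustration under stated assumptions, not the code in the fork: the class name
InMemoryStore, its methods, and the max_entries parameter are all hypothetical, and the single
cap stands in for whatever per-type limits end up being honored (the existing settings are
presumably ones like spark.ui.retainedJobs and spark.ui.retainedStages).

```python
from collections import OrderedDict


class InMemoryStore:
    """Hypothetical in-memory store keeping at most `max_entries` items per type."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        # One insertion-ordered map per entity type ("job", "stage", "task", ...).
        self._data = {}

    def write(self, kind, key, value):
        """Insert or update an entry; return the keys evicted to stay under the cap."""
        entries = self._data.setdefault(kind, OrderedDict())
        if key in entries:
            # Updating an existing entry keeps its original position in the order.
            entries[key] = value
            return []
        entries[key] = value
        evicted = []
        while len(entries) > self.max_entries:
            # popitem(last=False) removes the oldest inserted entry.
            old_key, _ = entries.popitem(last=False)
            evicted.append(old_key)
        return evicted

    def read(self, kind, key):
        """Return the stored value, or None if it is absent or was evicted."""
        return self._data.get(kind, {}).get(key)

    def count(self, kind):
        """Return how many entries of the given type are currently held."""
        return len(self._data.get(kind, {}))
```

With max_entries=2, writing jobs 1, 2 and 3 in that order evicts job 1 on the third write,
while entries of other types (stages, tasks) are counted separately. A real implementation
would likely prefer evicting completed jobs over running ones; this sketch ignores that.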



> Better History Server scalability for many / large applications
> ---------------------------------------------------------------
>
>                 Key: SPARK-18085
>                 URL: https://issues.apache.org/jira/browse/SPARK-18085
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>         Attachments: spark_hs_next_gen.pdf
>
>
> It's a known fact that the History Server currently has some annoying issues when serving
> lots of applications, and when serving large applications.
> I'm filing this umbrella to track work related to addressing those issues. I'll be attaching
> a document shortly describing the issues and suggesting a path to how to solve them.


