spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-18085) Scalability enhancements for the History Server
Date Wed, 26 Oct 2016 16:43:58 GMT

    [ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608968#comment-15608968
] 

Marcelo Vanzin commented on SPARK-18085:
----------------------------------------

bq. this really sounds very much like the Hadoop ATS V1.5. Have you looked at that?

I haven't looked at 1.5 exactly, but I assume it's not much different from v1. The problem
with the ATS is twofold: it's an external YARN dependency, and its based on "immutable events";
once you write a piece of data to the ATS you can't update it. My spec requires the ability
to update objects stored in the underlying store.

I haven't looked closely at what will be done with v2, but the blurbs I looked at a while
ago didn't really impress me much; the idea of running a separate JVM side-by-side with the
AM sounded weird to me. Also, it still has the YARN dependency which may not be desired by
a bunch of Spark users.

bq. I assume this a separate levelDB that stores just the metadata to do the simple listing
on startup?

Yes, there are n+1 dbs kept by the SHS (listing + 1 per UI).

bq. Just a quick overall picture of what I think is being proposed without all the incremental
steps and leaving out UI parts

That's pretty accurate. As far as cleanup policy, currently the code just cleans up based
on the existing clean up policy; I've been thinking about adding a second cleaning thread
to clean up the local data based on files that were deleted from HDFS outside of the SHS,
but haven't gotten to that yet.


bq. How is this solving quickly listing new apps issue?

It's not, at least not explicitly. Incremental parsing would be a way to handle that, and
also hooking up with HDFS's "inotify" API to detect new files or files being renamed. Or writing
a separate "summary" file somewhere as you suggest. But those enhancements can be done separately
from this work, really (which is why I kept SPARK-6951 a separate issue).

bq. streaming data, I'm not sure if streaming stores history at this point? 

No, streaming doesn't write to the event log, so there's no streaming UI in the SHS. This
work touches streaming because I'm changing the backing store for UI data, but I don't want
to tackle streaming history at this point in time. That can be done separately (and this work
might help make it a more viable idea).

I also don't want to stray too far from the UI / SHS enhancements here. For example, breaking
up large event log files (to make logging streaming events more palatable) would be nice,
but in a sense it's orthogonal to what's being proposed here.

Finally, note that even though I don't explicitly call this out in the document, the proposal
here is less about where the data will be stored and more about changing the underlying architecture
to allow the data to be stored in a different place. If you take a look at the M1 code, there's
an abstraction for an external store, and I just happen to have a LevelDB implementation.
You could potentially implement an in-memory version, or an ATS version, or something else
crazy. But the main thing I want to change is changing the idea that UI data is stored in
memory, which is the source of most of the SHS issues we see.

> Scalability enhancements for the History Server
> -----------------------------------------------
>
>                 Key: SPARK-18085
>                 URL: https://issues.apache.org/jira/browse/SPARK-18085
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>         Attachments: spark_hs_next_gen.pdf
>
>
> It's a known fact that the History Server currently has some annoying issues when serving
lots of applications, and when serving large applications.
> I'm filing this umbrella to track work related to addressing those issues. I'll be attaching
a document shortly describing the issues and suggesting a path to how to solve them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message