spark-dev mailing list archives

From Pillis Work <pillis.w...@gmail.com>
Subject Re: About Spark job web ui persist(JIRA-969)
Date Thu, 16 Jan 2014 17:28:15 GMT
Hi Junluan,
1. Yes, we could persist to HDFS or any FS. I think at a minimum we should
persist to local disk - it keeps the core simple.
We can think of HDFS interaction as level-2 functionality that can be
implemented once we have a good local implementation. The
persistence/hydration layer of a SparkContextData can be made pluggable as
a next step (a rough sketch follows below).
Also, as mentioned in the previous mail, SparkUI will now show multiple
SparkContexts using data from SparkContextDatas.

2. Yes, we could do that.

3. Yes, SparkUI will need a rewrite to deal with SparkContextDatas (either
live, or hydrated from historical JSONs).
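
To make the pluggable persistence idea a bit more concrete, here is a rough
sketch of the kind of interface I have in mind (all names and signatures are
placeholders, not a proposal for the final API):

case class SparkContextData(name: String, startTime: Long, stats: Map[String, String])

// The layer the core would talk to; level-1 is local disk, and an
// HDFS-backed implementation could be plugged in later as level-2.
trait SparkContextDataStore {
  def save(data: SparkContextData): Unit   // called periodically while a context runs
  def loadAll(): Seq[SparkContextData]     // used by SparkUI to show past and present contexts
}

// Minimal level-1 implementation: one JSON file per run under `dir`.
class LocalDiskStore(dir: java.io.File) extends SparkContextDataStore {
  def save(data: SparkContextData): Unit = ???   // serialize `data` to JSON in `dir`
  def loadAll(): Seq[SparkContextData] = ???     // hydrate every JSON file found in `dir`
}
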
Regards




On Thu, Jan 16, 2014 at 8:15 AM, Xia, Junluan <junluan.xia@intel.com> wrote:

> Hi Pillis
>
> It sounds good.
> 1. For SparkContextData, I think we could persist it in HDFS rather than
> on local disk (one SparkUI service may show more than one SparkContext).
> 2. We could also treat SparkContextData as a metrics input (MetricsSource);
> for long-running Spark jobs, SparkContextData would then show up in
> Ganglia/JMX, etc.
> 3. If we persist SparkContextData periodically, we need to rewrite the UI
> logic, as the Spark UI currently only shows information for a single point
> in time.
>
> -----Original Message-----
> From: Pillis Work [mailto:pillis.work@gmail.com]
> Sent: Thursday, January 16, 2014 5:37 PM
> To: dev@spark.incubator.apache.org
> Subject: Re: About Spark job web ui persist(JIRA-969)
>
> Hello,
> I wanted to write down at a high level the changes I was thinking of.
> Please feel free to critique and suggest changes.
>
> SparkContext:
> SparkContext's start will no longer start the UI. Instead it will launch
> a SparkContextObserver (which has the SparkListener trait) that generates
> a SparkContextData instance. The SparkContextObserver keeps the
> SparkContextData up to date. SparkContextData will have all the historical
> information anyone needs. Stopping a SparkContext stops the
> SparkContextObserver.
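>
> Roughly what I am picturing for the observer - the event types below are
> simplified placeholders, not the real SparkListener events, so treat this
> purely as a sketch:
>
> import scala.collection.mutable
>
> // Simplified stand-ins for the scheduler events the real listener would see.
> sealed trait ContextEvent
> case class StageCompleted(stageId: Int, durationMs: Long) extends ContextEvent
> case class JobEnded(jobId: Int, succeeded: Boolean) extends ContextEvent
>
> // Launched by SparkContext at start(), stopped in stop(); keeps the
> // historical record up to date as events arrive.
> class SparkContextObserver {
>   private val stageDurations = mutable.Map[Int, Long]()
>   private val jobOutcomes = mutable.Map[Int, Boolean]()
>
>   def onEvent(event: ContextEvent): Unit = event match {
>     case StageCompleted(id, ms) => stageDurations(id) = ms
>     case JobEnded(id, ok) => jobOutcomes(id) = ok
>   }
>
>   // Flat snapshot that SparkContextData / the persistence layer can consume.
>   def snapshot: Map[String, String] =
>     stageDurations.map { case (id, ms) => s"stage.$id.durationMs" -> ms.toString }.toMap ++
>       jobOutcomes.map { case (id, ok) => s"job.$id.succeeded" -> ok.toString }.toMap
> }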
>
> SparkContextData:
> Has all the historical information of a SparkContext run. Periodically
> persists itself to disk as JSON. Can hydrate itself from the same JSON.
> SparkContextDatas are created without any UI usage. SparkContextData can
> evolve independently of what the UI needs - for example, carrying non-UI
> data needed for third-party integration.
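>
> For the persist/hydrate cycle I am imagining something along these lines
> (the use of the Scala standard library JSON classes and the exact file
> layout are just assumptions for illustration):
>
> import java.io.{File, PrintWriter}
> import scala.io.Source
> import scala.util.parsing.json.{JSON, JSONObject}
>
> // Historical record of one SparkContext run, kept as flat key/value pairs.
> case class SparkContextData(name: String, startTime: Long, stats: Map[String, String]) {
>
>   // Periodically called to write the current state out as JSON.
>   def persist(file: File): Unit = {
>     val json = JSONObject(stats ++ Map("name" -> name, "startTime" -> startTime.toString))
>     val out = new PrintWriter(file)
>     try out.write(json.toString()) finally out.close()
>   }
> }
>
> object SparkContextData {
>   // Rebuilds a SparkContextData from a previously persisted JSON file.
>   def hydrate(file: File): Option[SparkContextData] = {
>     val text = Source.fromFile(file).mkString
>     JSON.parseFull(text).collect {
>       case parsed: Map[_, _] =>
>         val fields = parsed.asInstanceOf[Map[String, Any]].map { case (k, v) => k -> v.toString }
>         SparkContextData(fields("name"), fields("startTime").toLong, fields - "name" - "startTime")
>     }
>   }
> }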
>
> SparkUI:
> No longer needs a SparkContext. It will need an array of SparkContextDatas
> (obtained either by polling a folder or by other means). UI pages will, at
> render time, access the appropriate SparkContextData and produce HTML.
> SparkUI can be started and stopped independently of SparkContexts, and
> multiple SparkContexts can be shown in the UI.
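>
> And on the UI side, loading could be as simple as the sketch below (the
> history folder and the .json suffix are just assumptions; hydrate is the
> method sketched above):
>
> import java.io.File
>
> object SparkUIHistory {
>   // Scan the history folder and hydrate every persisted run. The rewritten
>   // UI pages would render from these at request time instead of holding a
>   // live SparkContext.
>   def loadHistory(historyDir: File): Seq[SparkContextData] =
>     Option(historyDir.listFiles())
>       .getOrElse(Array.empty[File])
>       .filter(_.getName.endsWith(".json"))
>       .flatMap(f => SparkContextData.hydrate(f))
>       .toSeq
> }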
>
> I have purposefully not gone into much detail. Please let me know if any
> piece needs to be elaborated.
> Regards,
> Pillis
>
>
>
>
> On Mon, Jan 13, 2014 at 1:32 PM, Patrick Wendell <pwendell@gmail.com>
> wrote:
>
> > Pillis - I agree we need to decouple the representation from a
> > particular history server. But why not provide a simple history server
> > that people can (optionally) run if they aren't using YARN or Mesos?
> > For people running the standalone cluster scheduler this seems
> > important. Giving them only a JSON dump isn't very consumable for
> > most users.
> >
> > - Patrick
> >
> > On Mon, Jan 13, 2014 at 10:43 AM, Pillis Work <pillis.work@gmail.com>
> > wrote:
> > > The listeners in SparkUI which update the counters can trigger saves
> > > along the way. The save can happen on a 500ms delay after the last
> > > update, to batch changes. This solution would not require a save on
> > > stop().
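> > >
> > > For example, a small debounce along these lines (just a sketch; the
> > > timer-based approach and the class name are only illustrative):
> > >
> > > import java.util.{Timer, TimerTask}
> > >
> > > // Batches rapid listener updates: each update re-arms a 500ms timer,
> > > // and the save only fires once events have been quiet for that long.
> > > class DebouncedSaver(save: () => Unit, delayMs: Long = 500L) {
> > >   private val timer = new Timer("context-data-saver", true)
> > >   private var pending: Option[TimerTask] = None
> > >
> > >   // Called by the listeners right after they mutate their counters.
> > >   def onUpdate(): Unit = synchronized {
> > >     pending.foreach(_.cancel())
> > >     val task = new TimerTask { def run(): Unit = save() }
> > >     pending = Some(task)
> > >     timer.schedule(task, delayMs)
> > >   }
> > > }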
> > >
> > >
> > >
> > > On Mon, Jan 13, 2014 at 6:15 AM, Tom Graves <tgraves_cs@yahoo.com>
> > wrote:
> > >
> > >> So the downside to just saving stuff at the end is that if the app
> > >> crashes or exits badly you don't have anything. Hadoop has taken the
> > >> approach of saving events along the way. But Hadoop also uses that
> > >> history file to start where it left off if something bad happens and
> > >> it gets restarted. I don't think the latter really applies to Spark
> > >> though.
> > >>
> > >> Does Mesos have a history server?
> > >>
> > >> Tom
> > >>
> > >>
> > >>
> > >> On Sunday, January 12, 2014 9:22 PM, Pillis Work <pillis.work@gmail.com>
> > >> wrote:
> > >>
> > >> IMHO, from a pure Spark standpoint, I don't know if having a
> > >> dedicated history service makes sense as of now - considering that
> > >> cluster managers have their own history servers. Just showing the UI
> > >> of historical runs might be too thin a requirement for a full
> > >> service. Spark should store history information that can later be
> > >> exposed in whatever ways are required.
> > >>
> > >> Since each SparkContext is the logical entry and exit point for
> > >> doing something useful in Spark, during its stop() it should
> > >> serialize that run's statistics into a JSON file - like
> > >> "sc_run_[name]_[start-time].json". When SparkUI.stop() is called, it
> > >> in turn asks its UI objects (which should implement a trait) to
> > >> provide either a flat or hierarchical Map of String key/value pairs.
> > >> This map (flat or hierarchical) is then serialized to a configured
> > >> path (the default being "var/history").
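> > >>
> > >> Concretely, I am picturing something like the sketch below (the trait
> > >> name, the file naming and the use of the standard library JSON writer
> > >> are placeholders rather than a final design):
> > >>
> > >> import java.io.{File, PrintWriter}
> > >> import scala.util.parsing.json.JSONObject
> > >>
> > >> // Each UI object would implement this so stop() can collect its state.
> > >> trait HistoryReportable {
> > >>   def historyData: Map[String, String]  // flat key/value snapshot of the page
> > >> }
> > >>
> > >> object HistoryPersistence {
> > >>   // Called from SparkUI.stop(): merge every page's snapshot and write
> > >>   // one JSON file per run, e.g. var/history/sc_run_myapp_1389887295000.json
> > >>   def persistRun(name: String, startTime: Long, pages: Seq[HistoryReportable],
> > >>                  historyDir: File = new File("var/history")): Unit = {
> > >>     historyDir.mkdirs()
> > >>     val merged = pages.flatMap(_.historyData).toMap
> > >>     val out = new PrintWriter(new File(historyDir, s"sc_run_${name}_$startTime.json"))
> > >>     try out.write(JSONObject(merged).toString()) finally out.close()
> > >>   }
> > >> }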
> > >>
> > >> With regards to Mesos or YARN, their applications can import this
> > >> Spark history into their history servers during shutdown - by making
> > >> API calls etc.
> > >>
> > >> This way Spark's history information is persisted independently of
> > >> the cluster framework, and cluster frameworks can import the history
> > >> when/as needed.
> > >> Hope this helps.
> > >> Regards,
> > >> pillis
> > >>
> > >>
> > >>
> > >> On Thu, Jan 9, 2014 at 6:13 AM, Tom Graves <tgraves_cs@yahoo.com>
> > wrote:
> > >>
> > >> > Note that it looks like we are planning on adding support for
> > >> > application-specific frameworks in YARN sooner rather than later.
> > >> > There is an initial design up here:
> > >> > https://issues.apache.org/jira/browse/YARN-1530. Note this has not
> > >> > been reviewed yet, so changes are likely, but it gives an idea of
> > >> > the general direction. If anyone has comments on how that might
> > >> > work with Spark, I encourage you to post to the JIRA.
> > >> >
> > >> > As Sandy mentioned it would be very nice if the solution could be
> > >> > compatible with that.
> > >> >
> > >> > Tom
> > >> >
> > >> >
> > >> >
> > >> > On Wednesday, January 8, 2014 12:44 AM, Sandy Ryza
> > >> > <sandy.ryza@cloudera.com> wrote:
> > >> >
> > >> > Hey,
> > >> >
> > >> > YARN-321 is targeted for Hadoop 2.4. The minimum feature set
> > >> > doesn't include application-specific data, so that probably won't
> > >> > be part of 2.4 unless other things delay the release for a while.
> > >> > There are no APIs for it yet, and pluggable UIs have been discussed
> > >> > but not agreed upon. I think requirements from Spark could be
> > >> > useful in helping shape what gets done there.
> > >> >
> > >> > -Sandy
> > >> >
> > >> >
> > >> >
> > >> > On Tue, Jan 7, 2014 at 4:13 PM, Patrick Wendell
> > >> > <pwendell@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hey Sandy,
> > >> > >
> > >> > > Do you know what the status is for YARN-321 and what version of
> > >> > > YARN it's targeted for? Also, is there any kind of documentation
> > >> > > or API for this? Does it control the presentation of the data
> > >> > > itself (e.g. it actually has its own UI)?
> > >> > >
> > >> > > @Tom - having an optional history server sounds like a good idea.
> > >> > >
> > >> > > One question is what format to use for storing the data and how
> > >> > > the persisted format relates to XML/HTML generation in the live
> > >> > > UI. One idea would be to add JSON as an intermediate format
> > >> > > inside of the current WebUI, and then any JSON page could be
> > >> > > persisted and rendered by the history server using the same code.
> > >> > > Once a SparkContext exits it could dump a series of named paths,
> > >> > > each with a JSON file. Then the history server could load those
> > >> > > paths and pass them through the second rendering stage
> > >> > > (JSON => XML) to create each page.
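> > >> > >
> > >> > > Something like the following sketch is what I mean - the page
> > >> > > trait and renderer are only illustrative, not the actual WebUI
> > >> > > code:
> > >> > >
> > >> > > import scala.util.parsing.json.JSONObject
> > >> > > import scala.xml.Node
> > >> > >
> > >> > > // Stage 1: each page produces JSON instead of going straight to HTML.
> > >> > > trait JsonPage {
> > >> > >   def name: String
> > >> > >   def toJson: JSONObject
> > >> > > }
> > >> > >
> > >> > > // Stage 2: a single JSON => XML renderer shared by the live UI and
> > >> > > // the history server, so persisted and live pages use the same code.
> > >> > > object PageRenderer {
> > >> > >   def render(json: JSONObject): Node =
> > >> > >     <table>
> > >> > >       {
> > >> > >         json.obj.toSeq.sortBy(_._1).map { case (k, v) =>
> > >> > >           <tr><td>{k}</td><td>{v.toString}</td></tr>
> > >> > >         }
> > >> > >       }
> > >> > >     </table>
> > >> > > }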
> > >> > >
> > >> > > It would be good if SPARK-969 had a good design doc before
> > >> > > anyone starts working on it.
> > >> > >
> > >> > > - Patrick
> > >> > >
> > >> > > On Tue, Jan 7, 2014 at 3:18 PM, Sandy Ryza <sandy.ryza@cloudera.com>
> > >> > > wrote:
> > >> > > > As a sidenote, it would be nice to make sure that whatever is
> > >> > > > done here will work with the YARN Application History Server
> > >> > > > (YARN-321), a generic history server that functions similarly to
> > >> > > > MapReduce's JobHistoryServer. It will eventually have the ability
> > >> > > > to store application-specific data.
> > >> > > >
> > >> > > > -Sandy
> > >> > > >
> > >> > > >
> > >> > > > On Tue, Jan 7, 2014 at 2:51 PM, Tom Graves <tgraves_cs@yahoo.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > >> I don't think you want to save the html/xml files. I would
> > >> > > >> rather see the info saved into a history file in something like
> > >> > > >> a JSON format that could then be re-read, with the web UI
> > >> > > >> displaying the info, hopefully without much change to the UI
> > >> > > >> parts. For instance, perhaps the history server could read the
> > >> > > >> file and populate the appropriate Spark data structures that
> > >> > > >> the web UI already uses.
> > >> > > >>
> > >> > > >> I would suggest making the history server an optional server
> > >> > > >> that could be run on any node. That way, if the load on a
> > >> > > >> particular node becomes too much it could be moved, but you
> > >> > > >> could also run it on the same node as the Master. All it really
> > >> > > >> needs to know is where to get the history files from, and it
> > >> > > >> needs access to that location.
> > >> > > >>
> > >> > > >> Hadoop actually has a history server for MapReduce which works
> > >> > > >> very similarly to what I mention above. One thing to keep in
> > >> > > >> mind here is security. You want to make sure that the history
> > >> > > >> files can only be read by users who have the appropriate
> > >> > > >> permissions. The history server itself could run as a superuser
> > >> > > >> who has permission to serve up the files based on the ACLs.
> > >> > > >>
> > >> > > >>
> > >> > > >>
> > >> > > >> On Tuesday, January 7, 2014 8:06 AM, "Xia, Junluan"
> > >> > > >> <junluan.xia@intel.com> wrote:
> > >> > > >>
> > >> > > >> Hi all,
> > >> > > >> The Spark job web UI is not available once a job is over, but
> > >> > > >> it would be convenient for developers to debug against a
> > >> > > >> persisted job web UI. I have come up with a draft for this
> > >> > > >> issue.
> > >> > > >>
> > >> > > >> 1. We could simply save the web pages in html/xml format
> > >> > > >> (stages/executors/storages/environment) to a certain location
> > >> > > >> when the job finishes.
> > >> > > >>
> > >> > > >> 2. But it is not easy for users to review the job info with #1,
> > >> > > >> so we could build an extra job history service for developers.
> > >> > > >>
> > >> > > >> 3. But where would we build this history service? On the Driver
> > >> > > >> node or the Master node?
> > >> > > >>
> > >> > > >> Any suggestions about this improvement?
> > >> > > >>
> > >> > > >> regards,
> > >> > > >> Andrew
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
>
