spark-reviews mailing list archives

From jmwdpk <...@git.apache.org>
Subject [GitHub] spark pull request #20823: [SPARK-23674] Add Spark ML Listener for Tracking ...
Date Wed, 14 Mar 2018 17:58:55 GMT
GitHub user jmwdpk opened a pull request:

    https://github.com/apache/spark/pull/20823

    [SPARK-23674] Add Spark ML Listener for Tracking ML Pipeline Status

    ## What changes were proposed in this pull request?
    
    To track the status of a Spark ML pipeline, this PR adds the trait MLListenEvent, which
represents the events proposed in [SPARK-23674](https://issues.apache.org/jira/browse/SPARK-23674),
and the trait MLListener, whose onEvent method is used to override doPostEvent in ListenerBus
and post the events to a specific listener.
    
    In Pipeline.scala, PipelineStage now extends ListenerBus so that pipeline-related events
can be posted to a specific listener through doPostEvent. All pipeline-related events are posted
to every registered listener through postToAll.
    
    In ReadWrite.scala, MLWriter now extends ListenerBus so that save-related events
can be posted to a specific listener.
    
    ## How was this patch tested?
    
    The tests use a recorder, a mutable buffer, to capture the actual pipeline events. A newly
created MLListener overrides onEvent to append every event to the recorder. The listener is
added to the object (pipeline/newPipelineModel/pipelineWritter/pipelineModelWritter)
that corresponds to the operation (fit/transform/save/save) associated with each type of event.
Finally, the captured events are compared with the expected events specified in the
tests.
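
The recorder approach described here can be sketched without Spark on the classpath. This is a minimal illustration under stated assumptions, not the code in this PR: `SimpleListenerBus` is a simplified stand-in for Spark's `ListenerBus`, and the event classes `FitStart` and `FitEnd`, the `ToyPipeline` class, and the `ListenerSketch` object are hypothetical names introduced only for this example.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical event hierarchy; the concrete events in the PR may differ.
sealed trait MLListenEvent
case class FitStart(stage: String) extends MLListenEvent
case class FitEnd(stage: String) extends MLListenEvent

// A listener receives every event through a single onEvent callback.
trait MLListener {
  def onEvent(event: MLListenEvent): Unit
}

// Simplified stand-in for Spark's ListenerBus (assumption: only the
// addListener / doPostEvent / postToAll roles are modeled here).
trait SimpleListenerBus {
  private val listeners = ArrayBuffer.empty[MLListener]

  def addListener(listener: MLListener): Unit = listeners += listener

  // Delivers one event to one listener by delegating to onEvent.
  protected def doPostEvent(listener: MLListener, event: MLListenEvent): Unit =
    listener.onEvent(event)

  // Delivers one event to every registered listener, in registration order.
  def postToAll(event: MLListenEvent): Unit =
    listeners.foreach(l => doPostEvent(l, event))
}

// Toy stage that posts an event before and after its (empty) fit step.
class ToyPipeline(name: String) extends SimpleListenerBus {
  def fit(): Unit = {
    postToAll(FitStart(name))
    // Real training work would happen here.
    postToAll(FitEnd(name))
  }
}

object ListenerSketch {
  // Registers a recording listener, runs fit, and returns what was captured.
  def run(): Seq[MLListenEvent] = {
    val recorder = ArrayBuffer.empty[MLListenEvent]
    val pipeline = new ToyPipeline("toy")
    pipeline.addListener(new MLListener {
      override def onEvent(event: MLListenEvent): Unit = recorder += event
    })
    pipeline.fit()
    recorder.toSeq
  }
}
```

A test in this style would compare `ListenerSketch.run()` with the expected sequence `Seq(FitStart("toy"), FitEnd("toy"))`, which mirrors the comparison of captured and expected events described above.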
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jmwdpk/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20823.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20823
    
----
commit 7097400b0453a0cd2ab3fa7128cae247797030de
Author: Ming Jiang <mjiang@...>
Date:   2018-03-14T16:56:21Z

    added MLListener.scala with trait MLListenEvent to monitor the jira proposed events, added
postToAll in  Pipeline.scala so that pipeline related events were posted to all registered
listeners, added pipelineJJobTracker test case

commit f664b527cdf05a53766bb0bd3009a0cb15833f41
Author: Ming Jiang <mjiang@...>
Date:   2018-03-14T17:37:48Z

    added test cases: pipeline model transform tracker, Pipeline read/write tracker, PipelineModel
read/write tracker

----

