spark-reviews mailing list archives

From jkbradley <...@git.apache.org>
Subject [GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans
Date Mon, 24 Oct 2016 20:36:53 GMT
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/11119
  
    > Would you mind pointing me to an example of an algorithm which only copies some,
but not all, of the estimator params?
    
    ALS is a good example: [https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala#L98]
    
    > [users identifying initialization method]
    
    I agree it's misleading to have mismatched Params initialModel and initMode, especially
if Model.initialModel does not exist.  I'd say this is an ideal solution:
    * (in this PR) Have setInitialModel also set k, initMode, etc. (where we create a new
initMode called "initialModel").
      * Calling setInitMode("initialModel") would probably need to throw an error.  This is
a minor issue IMO.
    * (in a follow-up PR)  The bullet point above has one bigger issue: setting initialModel
via `km.set(km.initialModel, initialModel)` would bypass the setter method and therefore
not set k, initMode, etc. appropriately.  This issue with tied Params has appeared elsewhere
in MLlib as well.  We could fix it by having the `Params.set` method use Scala
reflection to call the corresponding setter method.  We'd just have to take extra care to
test this well.
      * There are some Params in Models without matching setter methods.  Those were added
with the intention of having Estimator Params easily accessible from Models.  We'll just have
to keep these in mind when writing unit tests.
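
To make the tied-Params problem above concrete, here is a minimal, Spark-free sketch. Everything in it is a hypothetical stand-in: `Param`, `ToyKMeans`, `Centers`, and `setViaSetter` are illustration names, not Spark's actual classes or the change proposed in this PR. It also uses plain Java reflection and a `set<Name>` naming convention as an assumption, where the comment above suggests Scala reflection.

```scala
import scala.collection.mutable

object TiedParamsSketch {
  // Hypothetical, simplified stand-in for an ML Param: just a typed name.
  final case class Param[T](name: String)

  // Hypothetical representation of an initial model: one Seq per cluster center.
  type Centers = Seq[Seq[Double]]

  class ToyKMeans {
    val k = Param[Int]("k")
    val initMode = Param[String]("initMode")
    val initialModel = Param[Centers]("initialModel")

    // Assumed defaults, chosen only for this illustration.
    private val values = mutable.Map[String, Any]("k" -> 2, "initMode" -> "random")

    // Generic set: stores the value directly and runs no setter logic.
    def set[T](param: Param[T], value: T): this.type = {
      values(param.name) = value
      this
    }

    def get[T](param: Param[T]): Option[T] =
      values.get(param.name).map(_.asInstanceOf[T])

    def setK(value: Int): this.type = set(k, value)

    // "initialModel" is only reachable through setInitialModel, so reject it here.
    def setInitMode(value: String): this.type = {
      require(value != "initialModel", "use setInitialModel instead")
      set(initMode, value)
    }

    // The tied setter: keeps k and initMode consistent with the supplied centers.
    def setInitialModel(centers: Centers): this.type = {
      set(initialModel, centers)
      set(k, centers.length)
      set(initMode, "initialModel")
    }

    // One possible fix: look up a public one-argument method named set<Name>
    // and call it, falling back to the plain set when no such method exists.
    def setViaSetter[T](param: Param[T], value: T): this.type = {
      val setterName = "set" + param.name.capitalize
      getClass.getMethods
        .find(m => m.getName == setterName && m.getParameterCount == 1) match {
          case Some(m) => m.invoke(this, value.asInstanceOf[AnyRef])
          case None    => set(param, value)
        }
      this
    }
  }
}
```

With three centers, calling `set(km.initialModel, centers)` on a fresh `ToyKMeans` leaves `k` and `initMode` at their defaults, while `setViaSetter(km.initialModel, centers)` routes through `setInitialModel` and updates both. A Param with no matching setter, as in the last bullet, simply takes the fallback branch, which is one of the cases unit tests would need to cover.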

