spark-reviews mailing list archives

From MLnick <...@git.apache.org>
Subject [GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...
Date Thu, 03 Aug 2017 12:09:44 GMT
Github user MLnick commented on the issue:

    https://github.com/apache/spark/pull/18313
  
    I commented on the [JIRA](https://issues.apache.org/jira/browse/SPARK-21086?focusedCommentId=16112623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16112623).

    
    I would like to better understand the use case for keeping all the models (as per
my JIRA comment). I suspect that #16158 may be the more useful approach in practice.
    
    But overall, if there is a use case for keeping the models, I would agree with @jkbradley's
suggestion that we offer a simple "keep all sub-models as a field in the model" approach,
and also consider the large-scale case, possibly with a "dump to file" option.
    
    In addition, could we have an option to keep the "best", "all" or top "_k_" models (with _k_
user-specified as a number or a percentage)?
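
To make the "best" / "all" / "_k_" idea concrete, here is a minimal sketch in plain Python of how such an option might pick which sub-models to retain from a list of validation metrics. It is an illustration only, not Spark code: the function name `select_models_to_keep`, the `keep` parameter, and the `larger_is_better` flag are all hypothetical, and treating a float in (0, 1] as a percentage is an assumption about how "%" could be expressed.

```python
import math
from typing import List, Sequence, Union


def select_models_to_keep(
    metrics: Sequence[float],
    keep: Union[str, int, float] = "best",
    larger_is_better: bool = True,
) -> List[int]:
    """Return indices of the sub-models to retain, ranked best first.

    Hypothetical helper: `keep` is "best", "all", an int k (top k models),
    or a float in (0, 1] (that fraction of the models, rounded up).
    """
    if not metrics:
        return []

    # Rank model indices by their validation metric, best first.
    ranked = sorted(
        range(len(metrics)),
        key=lambda i: metrics[i],
        reverse=larger_is_better,
    )

    if keep == "all":
        return ranked
    if keep == "best":
        return ranked[:1]
    # bool is a subclass of int, so reject it explicitly before the int case.
    if isinstance(keep, bool):
        raise ValueError("keep must be 'best', 'all', an int, or a float")
    if isinstance(keep, int):
        if keep < 1:
            raise ValueError("k must be at least 1")
        return ranked[:keep]
    if isinstance(keep, float):
        if not 0.0 < keep <= 1.0:
            raise ValueError("a fractional keep must be in (0, 1]")
        k = max(1, math.ceil(keep * len(metrics)))
        return ranked[:k]
    raise ValueError("keep must be 'best', 'all', an int, or a float")
```

With metrics `[0.71, 0.83, 0.79, 0.65]`, `keep="best"` would retain only the model at index 1, while `keep=2` or `keep=0.5` would retain the models at indices 1 and 2. Retained models could then either be stored as a field in the fitted model or written to files, depending on scale.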
    




