spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-7535) Audit Pipeline APIs for 1.4
Date Fri, 22 May 2015 01:22:17 GMT

    [ https://issues.apache.org/jira/browse/SPARK-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554629#comment-14554629 ]

Xiangrui Meng edited comment on SPARK-7535 at 5/22/15 1:21 AM:
---------------------------------------------------------------

Some notes:

1. Estimator/Transformer don't need to extend Params since PipelineStage already does.
2. @varargs to setDefault (SPARK-7498)
3. Move Evaluator to ml.evaluation.
4. Mention that larger metric values are better.
5. PipelineModel doc. “compiled” -> “fitted”
6. Remove Params.validateParams(paramMap)?
7. UnresolvedAttribute (Java compatibility?)
8. Missing RegressionEvaluator (SPARK-7404)
9. ml.feature missing package doc (SPARK-7808)
10. param and getParam should be final (SPARK-7816)
11. Hide PolynomialExpansion.expand
12. Update RegexTokenizer default setting. (SPARK-7794)
13. Mention `RegexTokenizer` in `Tokenizer`. (SPARK-7794)
14. Hide VectorAssembler.
15. Word2Vec.minCount -> @param
16. ParamValidators -> @DeveloperApi
17. Params -> @DeveloperApi
18. ALS -> use dataframes to store user/item factors? Then we can hide ALS.Rating
19. ALSModel -> remove training parameters?
20. Hide MetadataUtils/SchemaUtils.
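
As a rough illustration of notes 1 and 10, the sketch below uses simplified stand-in types. The names `ParamsSketch`, `VerboseTransformer`, and `SlimTransformer` are hypothetical, and the trait bodies are assumptions made for the example rather than Spark's actual definitions. It only shows that a class extending `PipelineStage` already inherits `Params`, and that a `final` accessor cannot be overridden.

```scala
// Simplified stand-ins for illustration only; not Spark's actual definitions.
object ParamsSketch {
  trait Params {
    // Note 10: a final accessor cannot be overridden by subclasses.
    final def getParam(name: String): String = s"param:$name"
  }

  abstract class PipelineStage extends Params

  // Redundant: Params is already inherited through PipelineStage.
  abstract class VerboseTransformer extends PipelineStage with Params

  // Equivalent and shorter (note 1).
  abstract class SlimTransformer extends PipelineStage

  def main(args: Array[String]): Unit = {
    val t = new SlimTransformer {}
    // Both checks rely only on what PipelineStage inherits from Params.
    println(t.isInstanceOf[Params])
    println(t.getParam("maxIter"))
  }
}
```

Dropping the extra `with Params` changes nothing about the resulting type in this sketch; it only removes a redundant declaration.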


was (Author: mengxr):
Some notes:

1. Estimator/Transformer don't need to extend Params since PipelineStage already does.
2. @varargs to setDefault (SPARK-7498)
3. Move Evaluator to ml.evaluation.
4. Mention that larger metric values are better.
5. PipelineModel doc. “compiled” -> “fitted”
6. Remove Params.validateParams(paramMap)?
7. UnresolvedAttribute (Java compatibility?)
8. Missing RegressionEvaluator (SPARK-7404)
9. ml.feature missing package doc (SPARK-7808)
10. param and getParam should be final
11. Hide PolynomialExpansion.expand
12. Update RegexTokenizer default setting. (SPARK-7794)
13. Mention `RegexTokenizer` in `Tokenizer`. (SPARK-7794)
14. Hide VectorAssembler.
15. Word2Vec.minCount -> @param
16. ParamValidators -> @DeveloperApi
17. Params -> @DeveloperApi
18. ALS -> use dataframes to store user/item factors? Then we can hide ALS.Rating
19. ALSModel -> remove training parameters?
20. Hide MetadataUtils/SchemaUtils.

> Audit Pipeline APIs for 1.4
> ---------------------------
>
>                 Key: SPARK-7535
>                 URL: https://issues.apache.org/jira/browse/SPARK-7535
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, PySpark
>            Reporter: Joseph K. Bradley
>            Assignee: Xiangrui Meng
>
> This is an umbrella for auditing the Pipeline (spark.ml) APIs.  Items to check:
> * Public/protected/private access
> * Consistency across spark.ml
> * Classes, methods, and parameters in spark.mllib but missing in spark.ml
> ** We should create JIRAs for each of these (under an umbrella) as to-do items for future releases.
> For each algorithm or API component, create a subtask under this umbrella.  Some major new items:
> * new feature transformers
> * tree models
> * elastic-net
> * ML attributes
> * developer APIs (Predictor, Classifier, Regressor)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

