spark-issues mailing list archives

From "Xusen Yin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-14302) Python examples code merge and clean up
Date Sun, 01 May 2016 23:47:12 GMT

    [ https://issues.apache.org/jira/browse/SPARK-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266006#comment-15266006 ]

Xusen Yin commented on SPARK-14302:
-----------------------------------

[~kanjilal] Thanks for working on this. However, I checked the duplicated examples again and
found that we should not delete all of them, as described below:

* python/ml
** None

* Unsure duplications, double check
** dataframe_example.py  --> serves as an example of dataframe usage.
** kmeans_example.py  --> serves as an application.
** simple_params_example.py  --> serves as an example of params usage.
** simple_text_classification_pipeline.py  --> serves as an application.

* python/mllib
** gaussian_mixture_model.py  --> serves as an application.
** kmeans.py  --> ditto
** logistic_regression.py  --> ditto

* Unsure duplications, double check
** correlations.py  --> ditto
** random_rdd_generation.py  --> ditto
** sampled_rdds.py  --> ditto
** word2vec.py  --> ditto

So I think we can close this JIRA as won't fix. What do you think about it?

> Python examples code merge and clean up
> ---------------------------------------
>
>                 Key: SPARK-14302
>                 URL: https://issues.apache.org/jira/browse/SPARK-14302
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Examples
>            Reporter: Xusen Yin
>            Priority: Minor
>              Labels: starter
>
> Duplicated code that I found in python/examples/mllib and python/examples/ml:
> * python/ml
> ** None
> * Unsure duplications, double check
> ** dataframe_example.py
> ** kmeans_example.py
> ** simple_params_example.py
> ** simple_text_classification_pipeline.py
> * python/mllib
> ** gaussian_mixture_model.py
> ** kmeans.py
> ** logistic_regression.py
> * Unsure duplications, double check
> ** correlations.py
> ** random_rdd_generation.py
> ** sampled_rdds.py
> ** word2vec.py
> When merging and cleaning up this code, be sure not to disturb the existing example on and
> off blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

