spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-7753) Improve kernel density API
Date Wed, 20 May 2015 07:45:59 GMT
Xiangrui Meng created SPARK-7753:
------------------------------------

             Summary: Improve kernel density API
                 Key: SPARK-7753
                 URL: https://issues.apache.org/jira/browse/SPARK-7753
             Project: Spark
          Issue Type: Sub-task
          Components: MLlib
    Affects Versions: 1.4.0
            Reporter: Xiangrui Meng
            Assignee: Xiangrui Meng


Kernel density estimation is provided in many statistics libraries (see
http://en.wikipedia.org/wiki/Kernel_density_estimation#Statistical_implementation),
and we should make sure that our API is similar to theirs. The two most important
parameters of kernel density estimation are the kernel type and the bandwidth. Besides
density estimation, kernel methods are also used for smoothing. The current API supports
only the Gaussian kernel and only density estimation:

{code}
def kernelDensity(
    samples: RDD[Double],
    standardDeviation: Double,
    evaluationPoints: Iterable[Double]): Array[Double]
{code}

It would be nice if we could come up with an extensible API.
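
As a rough illustration of what "extensible" could mean here, the sketch below treats the
kernel type and the bandwidth as separate settings on a builder-style estimator, using the
usual estimate f(x) = 1/(n*h) * sum_i K((x - x_i) / h). Every name in it (Kernel,
Epanechnikov, KernelDensity, setKernel, setBandwidth, estimate) is hypothetical and is not
an existing or agreed Spark API. It also uses a local Seq[Double] instead of RDD[Double] so
that it runs without a Spark dependency, and it does not cover smoothing.

```scala
// Hypothetical sketch only: none of these names are part of the Spark API.

/** A kernel function K(u), evaluated at the scaled distance u = (x - sample) / bandwidth. */
sealed trait Kernel {
  def apply(u: Double): Double
}

object Kernel {
  /** Standard normal density, matching the kernel the current API hard-codes. */
  case object Gaussian extends Kernel {
    private val norm = 1.0 / math.sqrt(2.0 * math.Pi)
    def apply(u: Double): Double = norm * math.exp(-0.5 * u * u)
  }

  /** A second kernel, included only to show that new kernels plug in without API changes. */
  case object Epanechnikov extends Kernel {
    def apply(u: Double): Double =
      if (math.abs(u) <= 1.0) 0.75 * (1.0 - u * u) else 0.0
  }
}

/** Immutable estimator; each setter returns a new instance with one setting changed. */
final class KernelDensity private (kernel: Kernel, bandwidth: Double) {

  def setKernel(k: Kernel): KernelDensity = new KernelDensity(k, bandwidth)

  def setBandwidth(h: Double): KernelDensity = {
    require(h > 0.0, s"bandwidth must be positive, got $h")
    new KernelDensity(kernel, h)
  }

  /** Evaluates the density estimate at each point: sum_i K((x - x_i) / h) / (n * h). */
  def estimate(samples: Seq[Double], points: Seq[Double]): Array[Double] = {
    require(samples.nonEmpty, "samples must be non-empty")
    val n = samples.size
    points.map { x =>
      samples.map(s => kernel((x - s) / bandwidth)).sum / (n * bandwidth)
    }.toArray
  }
}

object KernelDensity {
  /** Defaults (Gaussian kernel, bandwidth 1.0) are arbitrary choices for this sketch. */
  def apply(): KernelDensity = new KernelDensity(Kernel.Gaussian, 1.0)
}
```

With this shape, a call might look like
KernelDensity().setKernel(Kernel.Epanechnikov).setBandwidth(0.5).estimate(samples, points),
and supporting another kernel means adding one more Kernel implementation rather than
changing the method signature. For the Gaussian kernel, the bandwidth plays the role of
standardDeviation in the current signature.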



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org
