commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MATH-278) Robust locally weighted regression (Loess / Lowess)
Date Sat, 20 Jun 2009 10:32:07 GMT

    [ https://issues.apache.org/jira/browse/MATH-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722165#action_12722165 ]

Sebb commented on MATH-278:
---------------------------

SVN keywords are filled in when the code is fetched from the server (and stripped off on upload).

I meant to just remove $Date:$ from the code, leaving:

 * @version $Revision: $

However, that can be done when the patch is applied.

> Robust locally weighted regression (Loess / Lowess)
> ---------------------------------------------------
>
>                 Key: MATH-278
>                 URL: https://issues.apache.org/jira/browse/MATH-278
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Eugene Kirpichov
>         Attachments: loess.patch, loess.patch.v2
>
>
> Attached is a patch, with tests, that implements the robust Loess procedure for smoothing
> univariate scatterplots with local linear regression (http://en.wikipedia.org/wiki/Local_regression),
> as described by William Cleveland in http://www.math.tau.ac.il/~yekutiel/MA%20seminar/Cleveland%201979.pdf
> (The patch also fixes one missing-javadoc checkstyle warning in the AbstractIntegrator
> class, so that the code with my patch applied generates no checkstyle warnings at all.)
> I propose including the procedure in commons-math because commons-math currently has no
> method for robust smoothing of noisy data: there is interpolation (which is practically
> unusable for noisy data) and there is regression, which has quite different goals.
> Loess allows one to build a smooth curve, with a controllable degree of smoothness, that
> approximates the overall shape of the data.
> I tried to follow the code requirements as strictly as possible: the tests cover the
> code completely, there are no checkstyle warnings, etc. I wrote the code entirely
> from scratch, borrowing no third-party licensed code.
> The method is fairly computationally intensive (10000 points with a bandwidth of 0.3
> and 4 robustness iterations take about 3.7 sec on my machine; in general the complexity is
> O(robustnessIters * n^2 * bandwidth)), but I don't know how to optimize it further; all the
> implementations I have found use exactly the same algorithm as mine for the one-dimensional case.
> Some TODOs, in sharply increasing order of complexity:
>  - Make the weight function customizable: according to Cleveland, this is needed only in
> exotic cases, for example where the desired approximation is discontinuous.
>  - Make the degree of the locally fitted polynomial customizable: currently the algorithm
> does only linear local regression; quadratic regression might also be useful.
> Higher degrees are not worth it, according to Cleveland.
>  - Generalize the algorithm to the multidimensional case: this will require A LOT of
> hard work.
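
For readers who do not have the patch at hand, the procedure described above can be sketched roughly as follows. This is NOT the code from loess.patch: the class name LoessSketch, the method names, the nearest-neighbour window search and the 1e-12 cut-offs are all assumptions made for illustration. It assumes x is sorted in increasing order, and it uses the tricube weight function and bisquare robustness weights (scaled by 6 times the median absolute residual) from Cleveland's paper.

```java
import java.util.Arrays;

/** Hypothetical sketch of robust Loess; not the code from loess.patch. */
public final class LoessSketch {

    /** Tricube kernel: (1 - |u|^3)^3 for |u| < 1, otherwise 0. */
    static double tricube(double u) {
        double a = Math.abs(u);
        if (a >= 1.0) {
            return 0.0;
        }
        double t = 1.0 - a * a * a;
        return t * t * t;
    }

    /** Bisquare robustness weight: (1 - u^2)^2 for |u| < 1, otherwise 0. */
    static double bisquare(double u) {
        double a = Math.abs(u);
        if (a >= 1.0) {
            return 0.0;
        }
        double t = 1.0 - a * a;
        return t * t;
    }

    static double median(double[] values) {
        double[] v = values.clone();
        Arrays.sort(v);
        int m = v.length / 2;
        return (v.length % 2 == 1) ? v[m] : 0.5 * (v[m - 1] + v[m]);
    }

    /** Assumes x is sorted ascending; bandwidth is the fraction of points used per local fit. */
    static double[] smooth(double[] x, double[] y, double bandwidth, int robustnessIters) {
        int n = x.length;
        int k = Math.min(n, Math.max(2, (int) Math.ceil(bandwidth * n)));
        double[] fitted = new double[n];
        double[] robust = new double[n];
        Arrays.fill(robust, 1.0);

        for (int iter = 0; iter <= robustnessIters; iter++) {
            for (int i = 0; i < n; i++) {
                // Grow a window [left, right] holding the k points nearest to x[i].
                int left = i;
                int right = i;
                while (right - left + 1 < k) {
                    if (left == 0) {
                        right++;
                    } else if (right == n - 1) {
                        left--;
                    } else if (x[i] - x[left - 1] <= x[right + 1] - x[i]) {
                        left--;
                    } else {
                        right++;
                    }
                }
                double h = Math.max(x[i] - x[left], x[right] - x[i]);

                // Weighted least-squares line through the window, evaluated at x[i].
                double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
                for (int j = left; j <= right; j++) {
                    double w = (h > 0 ? tricube((x[j] - x[i]) / h) : 1.0) * robust[j];
                    sw += w;
                    swx += w * x[j];
                    swy += w * y[j];
                    swxx += w * x[j] * x[j];
                    swxy += w * x[j] * y[j];
                }
                if (sw <= 0) {
                    fitted[i] = y[i]; // every neighbour was down-weighted to zero
                    continue;
                }
                double meanX = swx / sw;
                double meanY = swy / sw;
                double varX = swxx / sw - meanX * meanX;
                double cov = swxy / sw - meanX * meanY;
                double slope = varX > 1e-12 ? cov / varX : 0.0; // assumed cut-off
                fitted[i] = meanY + slope * (x[i] - meanX);
            }
            if (iter == robustnessIters) {
                break;
            }
            // Recompute robustness weights from the residuals of this pass.
            double[] absRes = new double[n];
            for (int i = 0; i < n; i++) {
                absRes[i] = Math.abs(y[i] - fitted[i]);
            }
            double s = median(absRes);
            if (s < 1e-12) {
                break; // fit is already essentially exact (assumed tolerance)
            }
            for (int i = 0; i < n; i++) {
                robust[i] = bisquare(absRes[i] / (6.0 * s));
            }
        }
        return fitted;
    }
}
```

With this sketch, the settings quoted in the description would correspond to a call such as LoessSketch.smooth(x, y, 0.3, 4). Each local fit touches about bandwidth * n points, and there are n fits per pass, which is consistent with the O(robustnessIters * n^2 * bandwidth) estimate given above.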

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

