commons-issues mailing list archives

From "Ben Nguyen (JIRA)" <>
Subject [jira] [Commented] (STATISTICS-8) Implementation of regression libraries within common-statistics framework
Date Thu, 04 Apr 2019 07:34:00 GMT


Ben Nguyen commented on STATISTICS-8:


I have just submitted my draft proposal, titled "New Regression Library Architecture Implementation
using Stream-based Java Statistical Processing".

Thank you in advance for any feedback you can give me.


> Implementation of regression libraries within common-statistics framework
> -------------------------------------------------------------------------
>                 Key: STATISTICS-8
>                 URL:
>             Project: Apache Commons Statistics
>          Issue Type: Task
>            Reporter: Eric Barnhill
>            Priority: Major
> Apache Commons is one of the most widely used resources among Java programmers around the world. Data-related applications are soaring, and Java is one of the most commonly used languages for data engineering. Consequently, the commons-statistics library, currently under development, is likely to find a widespread audience.
> For this project we aim to implement regression methods, arguably the most widely used techniques in statistics and machine learning, within the Apache Commons framework, in particular within the new commons-statistics library.
> The assignee will:
>  * Use core functionality from the regression sub-libraries of the deprecated commons-math 4 framework as a starting point
>  * Create a new, standalone Commons component for regression statistics, focusing first on linear and logistic regression
>  * Make architectural and design decisions in the Commons philosophy, that is, lightweight standalone components that are easy to understand and use by a wide range of Java developers (i.e. not a large, omnibus mathematical library with many degrees of abstraction)
>  * Draw inspiration from widely used libraries in scikit-learn and R to design an up-to-date statistics package
>  * Design unit testing and documentation for these libraries
> Particularly challenging design decisions include how to incorporate core matrix libraries with a minimum of dependencies and redundancies.

This message was sent by Atlassian JIRA
