commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] JSR 247: Data Mining 2.0
Date Mon, 02 Jan 2006 23:53:03 GMT
On 1/2/06, Mark Diggory <mdiggory@apache.org> wrote:
> Phil,
>
> This is a great idea as a specification and standard. We currently have
> a service in our project which does something similar, but it's mostly
> implemented in Perl and R.

What project would that be?
>
> I wonder though, how much of it would be implemented at that database
> level vs. in the application. For instance, in doing a transform that
> returned a subset of a dataset from a db, it would be much more efficient
> to do it at the db level (in the query) than in the application itself.

The spec being developed is focussed on the analytical / statistical
side rather than OLAP and also aims to be implementation-independent
(i.e., what is really being standardized is the API for vendors to
implement and client apps to use).  That said, your point is valid -
it may be difficult to optimize implementation of some functions when
the db engine can / should do much of the work natively.

> But I also like the idea of a standalone Java-based implementation
> (maybe on HSQLDB), or perhaps there's a direction that could be taken
> with Hibernate as well.
>
As noted above, the functional areas being considered are more
analytical - regression, clustering, classification, feature
extraction, etc.  The overlap with [math] is in the statistical stuff.

Phil


