hadoop-general mailing list archives

From Josh Patterson <j...@cloudera.com>
Subject Re: Digital Signal Processing Library + Hadoop
Date Tue, 08 Mar 2011 14:24:12 GMT
Roger,
A basic time series construct is the "sliding" window used in conjunction
with sorted time/value data. A sample implementation is on my GitHub:

https://github.com/jpatanooga/Caduceus/tree/master/src/tv/floe/caduceus/hadoop/movingaverage

There are two jobs in there, one that uses the shuffle and one that
does not, to illustrate the difference. I have a blog draft coming
that accompanies this code; I'll follow up and send you a copy of the
draft.
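
For readers without the repository at hand, here is a minimal Python
sketch of the sliding-window idea over sorted time/value data. It is not
the code from the repository above; the names moving_average, points and
window_size are hypothetical, and the in-process sort only stands in for
the ordering that a MapReduce job might obtain from the shuffle or from
pre-sorted input.

```python
from collections import deque


def moving_average(points, window_size):
    """Yield (timestamp, mean) for each full window over time-sorted points.

    points is an iterable of (timestamp, value) pairs in any order.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window = deque()
    running_sum = 0.0
    # Assumption: sorting here replaces the ordering a shuffle (or sorted
    # input) would otherwise provide.
    for timestamp, value in sorted(points, key=lambda p: p[0]):
        window.append(value)
        running_sum += value
        if len(window) > window_size:
            # Slide the window: drop the oldest value.
            running_sum -= window.popleft()
        if len(window) == window_size:
            yield timestamp, running_sum / window_size


samples = [(3, 30.0), (1, 10.0), (2, 20.0), (4, 40.0)]
result = list(moving_average(samples, 3))
```

Each emitted pair carries the timestamp of the newest point in the
window; no output is produced until the window is full.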

From that code you should be able to build out a more complex time
series / DSP process (using it as base code), something along the
lines of a 1NN classifier:

https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/docs/openPDC%20Datamining%20Tools%20Guide.pdf
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/src/TVA/Hadoop/MapReduce/Datamining/SAX/SlidingTSClassifier_kNN.java
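
As a rough illustration of what a 1NN classifier over sliding windows
can look like, here is a small Python sketch. It is not the openPDC
implementation: the path of the linked class mentions SAX, whereas this
sketch assumes plain Euclidean distance on raw values, and the names
sliding_windows, classify_1nn and the training data are hypothetical.

```python
import math


def sliding_windows(values, size, step=1):
    """Return every full window of `size` values, advancing by `step`."""
    return [values[i:i + size] for i in range(0, len(values) - size + 1, step)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(window, training):
    """Return the label of the nearest reference window.

    training is a list of (label, reference_window) pairs.
    """
    label, _ = min(training, key=lambda pair: euclidean(window, pair[1]))
    return label


# Assumed toy data: two labeled reference shapes and one short series.
training = [("flat", [1.0, 1.0, 1.0]), ("rising", [1.0, 2.0, 3.0])]
series = [1.0, 1.0, 1.0, 2.0, 3.0]
labels = [classify_1nn(w, training) for w in sliding_windows(series, 3)]
```

Each window of the series receives the label of its closest reference
window; a real job would also need to decide how to handle windows that
are far from every reference.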

I'm in the process of updating that older openPDC code to be more
modern and modular for general data sources.

Josh




On Sat, Mar 5, 2011 at 12:05 AM, Roger Smith <rogersmith1711@gmail.com> wrote:
> All -
> I wonder if any of you have integrated a DSP library with Hadoop.
> We are considering using Hadoop to process time series data, but don't
> want to write standard DSP functions.
>
> Roger.
>



-- 
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
blog: http://jpatterson.floe.tv
