hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Online machine learning on top of Hama BSP
Date Thu, 14 Jun 2012 18:45:36 GMT
I have read a bit about batch neural networks and I think I have found a
viable solution for BSP.
The funny thing is that it follows the same intuition as my k-means
clustering.

Each task processes a local block of the data, training a full model for
itself (making a forward pass and calculating the error of the output
neurons against the target values).
After you have iterated over all the observations, you send all the
weights of your neurons and the error (say, the average error over all
observations) to all the other tasks.
After sync, each task has #tasks weight sets per neuron plus the average
prediction errors; now the weights are accumulated and the backward step
with the error begins.
Once all weights are backpropagated on each task, you can start reading
the whole set of observations again for the next epoch (until some
minimum average error has been reached or a maximum number of epochs is
exceeded).
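
Rough sketch of that epoch loop (plain Python, nothing Hama-specific,
all names invented for illustration), with a one-weight linear model
standing in for the neural network to keep it short:

```python
# Each "task" trains on its local block, then all tasks exchange their
# weight and average error (the send + sync superstep is simulated by
# the averaging step), and the accumulated weights seed the next epoch.

def train_epoch(w, block, lr=0.05):
    """One local pass: forward step, squared error, gradient of y = w*x."""
    err, grad = 0.0, 0.0
    for x, y in block:
        e = w * x - y
        err += e * e
        grad += e * x
    n = len(block)
    return w - lr * grad / n, err / n

def bsp_batch_train(blocks, max_epochs=50, min_avg_error=1e-6):
    weights = [0.0] * len(blocks)            # one local model per task
    for _ in range(max_epochs):
        results = [train_epoch(w, b) for w, b in zip(weights, blocks)]
        # "send to all peers + sync": every task now sees every
        # weight and error, and accumulates them by averaging
        avg_w = sum(w for w, _ in results) / len(results)
        avg_err = sum(e for _, e in results) / len(results)
        weights = [avg_w] * len(blocks)      # start the next epoch
        if avg_err < min_avg_error:          # stopping criterion
            break
    return avg_w, avg_err

# Toy data y = 2x, split into three local blocks.
blocks = [[(1.0, 2.0), (2.0, 4.0)],
          [(3.0, 6.0), (4.0, 8.0)],
          [(5.0, 10.0), (6.0, 12.0)]]
w, err = bsp_batch_train(blocks)
```

With the data split across three blocks, the averaged weight converges
to roughly 2 within a handful of epochs.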

I don't know whether that is a common pattern in machine learning, but
it seems to me that we can extract some kind of API that helps build
local models and combine them again in the next superstep with more
information (think of the Pregel compute API, but at the task level
rather than the vertex level).
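
Something like this is what I mean by a task-level compute API (purely
hypothetical names, not an existing Hama or Pregel interface):

```python
# A per-task "compute" hook plus a superstep driver with barrier-style
# message delivery: each superstep, every task computes on its local
# block and the messages it received, and broadcasts new messages.

class TaskCompute:
    def compute(self, superstep, local_block, messages):
        """Return (outgoing_messages, halt) for this task."""
        raise NotImplementedError

def run_supersteps(tasks, blocks, max_supersteps=10):
    inboxes = [[] for _ in tasks]
    for step in range(max_supersteps):
        outboxes = [[] for _ in tasks]
        halted = True
        for i, task in enumerate(tasks):
            out, halt = task.compute(step, blocks[i], inboxes[i])
            halted = halted and halt
            for msg in out:                  # broadcast to all peers
                for box in outboxes:
                    box.append(msg)
        inboxes = outboxes                   # barrier sync
        if halted:
            break

# Example: every task broadcasts its local sum, then records the
# global sum it received from all peers.
class SumTask(TaskCompute):
    def __init__(self):
        self.total = None
    def compute(self, superstep, local_block, messages):
        if superstep == 0:
            return [sum(local_block)], False
        self.total = sum(messages)
        return [], True

tasks = [SumTask(), SumTask()]
run_supersteps(tasks, [[1, 2], [3, 4]])
```

The model-averaging scheme above would just be one implementation of
such a compute hook.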

What do you think about that?

2012/6/14 Thomas Jungblut <thomas.jungblut@googlemail.com>

> Very cool project; I just need a few vectors and matrices, for which I
> will use my own library first.
>
> Still having a hard time distributing the network and updating it
> accordingly in backprop. If you have smart ideas, let me know.
>
>
> 2012/6/14 Tommaso Teofili <tommaso.teofili@gmail.com>
>
>> Hi Thomas,
>> regarding neural networks, I'm also working on them within Apache Yay
>> (my Apache Labs project [1]), and I agree it'd make sense to run neural
>> network algorithms on top of Hama; however, at this stage I just have a
>> prototype in-memory implementation of feedforward (no actual learning)
>> neural networks.
>> Apart from that, I think we need a math/linear algebra package running
>> on top of Hama to make those algorithms scale nicely.
>> I agree we can start with batch and then move to online machine
>> learning algorithms.
>> Regards,
>> Tommaso
>>
>> [1] : http://svn.apache.org/repos/asf/labs/yay/trunk/
>>
>> 2012/6/13 Thomas Jungblut <thomas.jungblut@googlemail.com>
>>
>> > I'm still going to focus on batch learning; my next aim is to try
>> > out neural networks with BSP.
>> >
>> >
>> >
>> http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=685414&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2FFabs_all.jsp%2FFarnumber%2F685414
>> >
>> > http://techreports.cs.queensu.ca/files/1997-406.pdf
>> >
>> > Along with the pSVM, we then have two strong learners. If you're
>> > interested, send me a private message. But I have to take a few exams
>> > next week, so I'm busy; this is just an idea, and we'll see how fast
>> > I can get a prototype.
>> >
>> > Real time is difficult at the moment; we need out-of-sync messaging.
>> >
>> > 2012/6/13 Edward J. Yoon <edwardyoon@apache.org>
>> >
>> > > Thank you for your sharing!
>> > >
>> > > On Wed, Jun 13, 2012 at 7:03 PM, Tommaso Teofili
>> > > <tommaso.teofili@gmail.com> wrote:
>> > > > following up with this discussion on our dev list, I found an
>> > > > introductory pdf to online ML which may be useful [1].
>> > > > Apart from that, we can start by creating the module structure in
>> > > > the hama svn (still the incubator one, as the TLP move seems to
>> > > > take a while).
>> > > > Regards,
>> > > > Tommaso
>> > > >
>> > > > [1] : http://www.springerlink.com/content/m480047m572t6262/fulltext.pdf
>> > > >
>> > > > 2012/5/25 Edward J. Yoon <edwardyoon@apache.org>
>> > > >
>> > > >> I'm roughly thinking of creating a new module so that I can add
>> > > >> 3rd-party dependencies easily.
>> > > >>
>> > > >> On Fri, May 25, 2012 at 4:36 PM, Tommaso Teofili
>> > > >> <tommaso.teofili@gmail.com> wrote:
>> > > >> > Do you have a plan for that, Edward?
>> > > >> > A separate package in examples, or a separate (online) machine
>> > > >> > learning module? Or something else?
>> > > >> > Regards
>> > > >> > Tommaso
>> > > >> >
>> > > >> > 2012/5/25 Edward J. Yoon <edwardyoon@apache.org>
>> > > >> >
>> > > >> >> Okay, then let's get started.
>> > > >> >>
>> > > >> >> My first idea is a simple online recommendation system based
>> > > >> >> on click-stream data.
>> > > >> >>
>> > > >> >> On Thu, May 24, 2012 at 6:26 PM, Praveen Sripati
>> > > >> >> <praveensripati@gmail.com> wrote:
>> > > >> >> > +1
>> > > >> >> >
>> > > >> >> > For those who are interested in ML, please check this. GNU
>> > > >> >> > Octave is used.
>> > > >> >> >
>> > > >> >> > https://www.coursera.org/course/ml
>> > > >> >> >
>> > > >> >> > Another session is yet to be announced.
>> > > >> >> >
>> > > >> >> > Thanks,
>> > > >> >> > Praveen
>> > > >> >> >
>> > > >> >> > On Thu, May 24, 2012 at 12:54 PM, Thomas Jungblut <
>> > > >> >> > thomas.jungblut@googlemail.com> wrote:
>> > > >> >> >
>> > > >> >> >> +1
>> > > >> >> >>
>> > > >> >> >> 2012/5/24 Tommaso Teofili <tommaso.teofili@gmail.com>
>> > > >> >> >>
>> > > >> >> >> > and same here :)
>> > > >> >> >> >
>> > > >> >> >> > 2012/5/24 Vaijanath Rao <vaiju1981@gmail.com>
>> > > >> >> >> >
>> > > >> >> >> > > +1 me too
>> > > >> >> >> > > On May 23, 2012 10:26 PM, "Aditya Sarawgi" <
>> > > >> >> >> > > sarawgi.aditya@gmail.com> wrote:
>> > > >> >> >> > >
>> > > >> >> >> > > > +1
>> > > >> >> >> > > > I would be happy to help :)
>> > > >> >> >> > > >
>> > > >> >> >> > > > On Wed, May 23, 2012 at 6:23 PM, Edward J. Yoon <
>> > > >> >> >> > > > edwardyoon@apache.org> wrote:
>> > > >> >> >> > > >
>> > > >> >> >> > > > > Hi,
>> > > >> >> >> > > > >
>> > > >> >> >> > > > > Is anyone interested in online machine learning?
>> > > >> >> >> > > > >
>> > > >> >> >> > > > > --
>> > > >> >> >> > > > > Best Regards, Edward J. Yoon
>> > > >> >> >> > > > > @eddieyoon
>> > > >> >> >> > > > >
>> > > >> >> >> > > >
>> > > >> >> >> > > >
>> > > >> >> >> > > >
>> > > >> >> >> > > > --
>> > > >> >> >> > > > Cheers,
>> > > >> >> >> > > > Aditya Sarawgi
>> > > >> >> >> > > >
>> > > >> >> >> > >
>> > > >> >> >> >
>> > > >> >> >>
>> > > >> >> >>
>> > > >> >> >>
>> > > >> >> >> --
>> > > >> >> >> Thomas Jungblut
>> > > >> >> >> Berlin <thomas.jungblut@gmail.com>
>> > > >> >> >>
>> > > >> >>
>> > > >> >>
>> > > >> >>
>> > > >> >> --
>> > > >> >> Best Regards, Edward J. Yoon
>> > > >> >> @eddieyoon
>> > > >> >>
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> Best Regards, Edward J. Yoon
>> > > >> @eddieyoon
>> > > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Best Regards, Edward J. Yoon
>> > > @eddieyoon
>> > >
>> >
>> >
>> >
>> > --
>> > Thomas Jungblut
>> > Berlin <thomas.jungblut@gmail.com>
>> >
>>
>
>
>
> --
> Thomas Jungblut
> Berlin <thomas.jungblut@gmail.com>
>



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
