hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Online machine learning on top of Hama BSP
Date Fri, 25 May 2012 17:24:33 GMT
Hi Ted,

Giraph offers a graph layer that internally uses BSP on top of MapReduce.
You don't have access to the BSP primitives, so you need to treat
every machine learning problem as a graph problem, which may be very
inconvenient in many cases.
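
To illustrate what working directly with the BSP primitives (local compute,
send, barrier sync) looks like, here is a minimal single-process sketch in
Python. It is not Hama's or Giraph's API; the function name
`bsp_click_counts`, the partition layout, and the sample clicks are all
hypothetical and only meant to show the superstep pattern.

```python
from collections import Counter


def bsp_click_counts(partitions, master=0):
    """Simulate two BSP supersteps over click partitions, one per peer.

    Superstep 0: every peer counts the clicks in its own partition and
                 queues the partial counts as a message to the master.
    Barrier:     queued messages become visible only after all peers
                 have finished the superstep.
    Superstep 1: the master merges the partial counts it received.
    """
    num_peers = len(partitions)

    # Superstep 0: local computation plus sends (hypothetical peers 0..n-1).
    outbox = {peer: [] for peer in range(num_peers)}
    for clicks in partitions:
        partial = Counter(item for _user, item in clicks)
        outbox[master].append(partial)

    # Barrier synchronization: deliver all queued messages at once.
    inbox = outbox

    # Superstep 1: only the master has messages to process.
    total = Counter()
    for partial in inbox[master]:
        total.update(partial)
    return total


# Made-up (user, item) clicks, split across two simulated peers.
partitions = [
    [("u1", "shirt"), ("u2", "shirt")],
    [("u3", "overall"), ("u1", "shirt")],
]
print(bsp_click_counts(partitions).most_common(1))  # prints [('shirt', 3)]
```

Nothing here needs to be modelled as vertices and edges, which is the point
about having the primitives available.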

2012/5/25 Ted Dunning <ted.dunning@gmail.com>

> Apache Giraph probably offers a more mature BSP model of computation.  My
> guess is that it would make a stronger implementation substrate.  It
> certainly has a very strong community.
>
> On Fri, May 25, 2012 at 10:44 AM, Thomas Jungblut <
> thomas.jungblut@googlemail.com> wrote:
>
> > Hi Manuel,
> >
> > 300k is small; I have one with 6 million clicks.
> > However, it is more a question of interest and of which algorithms could be
> > suitable for BSP.
> > In case you wonder what BSP is, it stands for bulk synchronous parallel
> > [1].
> > We think that realtime and strongly iterative algorithms that are slow in
> > MapReduce could be solved more efficiently with BSP.
> > If you're interested, let us know.
> >
> > Regards,
> > Thomas
> >
> > [1] http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
> >
> > 2012/5/25 Manuel Blechschmidt <Manuel.Blechschmidt@gmx.de>
> >
> > > Hi Edward,
> > > do you already have a test dataset?
> > >
> > > I might get one with about 300,000 clicks for you.
> > >
> > > It is from www.nelou.com and we are already running a recommender in
> > > preview mode:
> > >
> > > http://www.nelou.com/artikel-803746/Overall-von-mysuro#__apaxoPreviewMode
> > >
> > > It could be the case that you would have to sign an NDA. Would this be
> > > possible for you?
> > >
> > > /Manuel
> > >
> > > On 25.05.2012, at 10:34, Edward J. Yoon wrote:
> > >
> > > > OK, I'm forwarding this to mahout dev.
> > > >
> > > > I'm planning to create a project related to online machine learning
> > > > as an Apache Hama sub-module, since the graph of message queues and
> > > > workers could be implemented using BSP (see also [1]). The first idea
> > > > is an online recommendation system based on click-stream data.
> > > >
> > > > If you are interested in this plan, let's talk together here.
> > > >
> > > > 1. http://codingwiththomas.blogspot.com/2011/10/apache-hama-realtime-processing.html
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Thomas Jungblut <thomas.jungblut@googlemail.com>
> > > > Date: Fri, May 25, 2012 at 4:55 PM
> > > > Subject: Re: Online machine learning on top of Hama BSP
> > > > To: dev@hama.apache.org
> > > >
> > > >
> > > > Should we cooperate with the Mahout guys on this? I'm pretty sure they
> > > > would have fun with it.
> > > > Edward, do you want to ask them?
> > > >
> > > > 2012/5/25 Tommaso Teofili <tommaso.teofili@gmail.com>
> > > >
> > > >> Do you have a plan for that Edward?
> > > >> A separate package in examples or a separate (online) machine learning
> > > >> module? Or something else?
> > > >> Regards
> > > >> Tommaso
> > > >>
> > > >> 2012/5/25 Edward J. Yoon <edwardyoon@apache.org>
> > > >>
> > > >>> OKay, then let's get started.
> > > >>>
> > > >>> My first idea is a simple online recommendation system based on
> > > >>> click-stream data.
> > > >>>
> > > >>> On Thu, May 24, 2012 at 6:26 PM, Praveen Sripati
> > > >>> <praveensripati@gmail.com> wrote:
> > > >>>> +1
> > > >>>>
> > > >>>> For those who are interested in ML, please check this. GNU Octave is
> > > >>>> used.
> > > >>>>
> > > >>>> https://www.coursera.org/course/ml
> > > >>>>
> > > >>>> Another session is yet to be announced.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Praveen
> > > >>>>
> > > >>>> On Thu, May 24, 2012 at 12:54 PM, Thomas Jungblut <
> > > >>>> thomas.jungblut@googlemail.com> wrote:
> > > >>>>
> > > >>>>> +1
> > > >>>>>
> > > >>>>> 2012/5/24 Tommaso Teofili <tommaso.teofili@gmail.com>
> > > >>>>>
> > > >>>>>> and same here :)
> > > >>>>>>
> > > >>>>>> 2012/5/24 Vaijanath Rao <vaiju1981@gmail.com>
> > > >>>>>>
> > > >>>>>>> +1 me too
> > > >>>>>>> On May 23, 2012 10:26 PM, "Aditya Sarawgi" <sarawgi.aditya@gmail.com>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> +1
> > > >>>>>>>> I would be happy to help :)
> > > >>>>>>>>
> > > >>>>>>>> On Wed, May 23, 2012 at 6:23 PM, Edward J. Yoon <edwardyoon@apache.org>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> Is anyone interested in online machine learning?
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> Best Regards, Edward J. Yoon
> > > >>>>>>>>> @eddieyoon
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> --
> > > >>>>>>>> Cheers,
> > > >>>>>>>> Aditya Sarawgi
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Thomas Jungblut
> > > >>>>> Berlin <thomas.jungblut@gmail.com>
> > > >>>>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Best Regards, Edward J. Yoon
> > > >>> @eddieyoon
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Thomas Jungblut
> > > > Berlin <thomas.jungblut@gmail.com>
> > > >
> > > >
> > > > --
> > > > Best Regards, Edward J. Yoon
> > > > @eddieyoon
> > >
> > > --
> > > Manuel Blechschmidt
> > > Dortustr. 57
> > > 14467 Potsdam
> > > Mobil: 0173/6322621
> > > Twitter: http://twitter.com/Manuel_B
> > >
> > >
> >
> >
> > --
> > Thomas Jungblut
> > Berlin <thomas.jungblut@gmail.com>
> >
>



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
