Mailing-List: contact hama-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hama-user@incubator.apache.org
Received-SPF: pass (athena.apache.org: domain of
 thomas.jungblut@googlemail.com designates 209.85.212.47 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAKi8Xk3XEZFHpjkkcBk-0wO5wwRVS5mftM6fYEmVs+jHNqgAJg@mail.gmail.com>
References: 
 <CAKi8Xk0mE_BgyX3NNNUhLB-=LB=6VO1-ioG0sU8jM2VeeZ7haw@mail.gmail.com>
	<CAJ-=ysmc4p+Hb1XCCGXT-Aok3eN9sH+8MP1FNhfq=RPZNV7ypw@mail.gmail.com>
	<CAKi8Xk3XEZFHpjkkcBk-0wO5wwRVS5mftM6fYEmVs+jHNqgAJg@mail.gmail.com>
Date: Fri, 2 Sep 2011 17:07:51 +0200
Message-ID: 
 <CAJ-=ys=PuU4r1UhCrYHTi2u=WF3C1aV6ruosf8ArM4+GOjc4JA@mail.gmail.com>
Subject: Re: About SVM Stochastic Gradient Descent (SGD) on BSP model
From: Thomas Jungblut <thomas.jungblut@googlemail.com>
To: hama-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=bcaec51b1659a7aea804abf6b9cc

--bcaec51b1659a7aea804abf6b9cc
Content-Type: text/plain; charset=ISO-8859-1

Hi Joe,

Great, thank you very much for the clarification.
I love classification algorithms, so I'll be very interested in how you
develop this.
In principle you can translate every MapReduce algorithm to BSP, since BSP
is the more general model and MapReduce can be expressed in it.
E.g. the map phase is a local computation phase, merging and sorting form
the synchronization barrier (which needs all map tasks to have finished),
and reducing is a computation phase again.
The English Wikipedia has a good diagram that shows the workflow.
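
To make that mapping concrete, here is a minimal sketch of word count as two
BSP supersteps. It is plain Python that simulates the peers sequentially, not
the Hama API; the function name and the toy partitioner are my own
assumptions for illustration.

```python
from collections import defaultdict


def bsp_word_count(shards, num_peers):
    """Simulate word count as two BSP supersteps on `num_peers` peers."""
    # Superstep 1 (the "map" phase): each shard is processed locally and a
    # (word, 1) message is queued for the peer that owns the word.
    inboxes = [[] for _ in range(num_peers)]
    for shard in shards:
        for line in shard:
            for word in line.split():
                # Deterministic toy partitioner (hypothetical, illustration only).
                owner = sum(word.encode()) % num_peers
                inboxes[owner].append((word, 1))

    # Barrier: in a real BSP run no peer continues until every peer has
    # finished superstep 1 and all messages are delivered. In this
    # sequential simulation the end of the loop above plays that role.

    # Superstep 2 (the "reduce" phase): each peer sums what it received.
    results = []
    for inbox in inboxes:
        counts = defaultdict(int)
        for word, n in inbox:
            counts[word] += n
        results.append(dict(counts))
    return results
```

Each peer ends up with the totals for the words it owns, which is what the
reducers would hold after the shuffle in MapReduce.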

Actually, you can do your map phase in BSP as well, but with the latest
release, 0.3.0, you have to write the data sharding and partitioning yourself.
There are examples and blog posts that show how to code them.
Your reduce step depends on your implementation. Is there a single
reducer that updates the whole classifier?
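
If it is a single reducer, one iteration could look roughly like the sketch
below. This is only my reading of your description, written in plain Python
rather than against the Hama API: each peer sums hinge-loss subgradients over
its local shard, and one master averages them and updates the weights. Note
that this is really a batch subgradient step per iteration, not a strictly
stochastic one, and all names and parameters (lam, eta, the shard layout) are
assumptions.

```python
def local_subgradient(w, shard):
    """'Map' step on one peer: summed hinge-loss subgradient of its points."""
    g = [0.0] * len(w)
    for x, y in shard:  # y is +1 or -1
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1.0:  # the point violates the margin
            for j, xj in enumerate(x):
                g[j] -= y * xj
    return g, len(shard)


def train(shards, dim, lam=0.01, eta=0.1, iterations=100):
    """Linear SVM: minimize lam/2 * |w|^2 + mean hinge loss over all points."""
    w = [0.0] * dim
    for _ in range(iterations):
        # Local computation: every peer works on its own shard.
        partials = [local_subgradient(w, shard) for shard in shards]
        # After the barrier, a single master combines the messages
        # (the "reduce" step) and updates the classifier.
        n = sum(count for _, count in partials)
        for j in range(dim):
            avg = sum(g[j] for g, _ in partials) / n
            w[j] -= eta * (lam * w[j] + avg)
        # The master would then send the new w to all peers before the
        # next superstep; here the shared variable stands in for that.
    return w
```

The only traffic per iteration is one gradient vector per peer and one weight
vector back, so the barrier cost should matter more than the message volume.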

Actually, I wanted to implement k-means clustering in BSP, but sadly I have
been very busy and have not had much time for it. It is quite similar to
your algorithm: the map step calculates the distance between the current
point and the centers, and the reducer updates the centers.
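
As a rough sketch of what I had in mind (plain Python, not Hama code, and
not something I have implemented in BSP yet; the function names are
hypothetical): every peer assigns its local points to the nearest center and
keeps partial sums, then the partial sums are combined after the barrier to
get the new centers.

```python
def assign_and_sum(points, centers):
    """'Map' step on one peer: nearest center per point, as partial sums."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in points:
        # Squared Euclidean distance from the point to every center.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        k = dists.index(min(dists))
        counts[k] += 1
        for j, value in enumerate(p):
            sums[k][j] += value
    return sums, counts


def kmeans_iteration(shards, centers):
    """One superstep pair: local assignment, then the center update."""
    partials = [assign_and_sum(shard, centers) for shard in shards]
    new_centers = []
    for k, old in enumerate(centers):
        n = sum(counts[k] for _, counts in partials)
        if n == 0:
            new_centers.append(list(old))  # keep the center of an empty cluster
            continue
        new_centers.append(
            [sum(sums[k][j] for sums, _ in partials) / n for j in range(len(old))]
        )
    return new_centers
```

You would call kmeans_iteration repeatedly until the centers stop moving.
Sending sums and counts instead of the points keeps the messages small.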

To give you a bit more information: I have already rewritten a MapReduce
graph algorithm for BSP. [1][2]
These examples do not use partitioning; I recently improved the
partitioning algorithm. So it makes sense to check out the current trunk
and browse through the graph package and the examples package. They contain
the improved partitioning as well as graph examples.

HF and GL.
If you need help, I'll be glad to help you.

[1]
http://codingwiththomas.blogspot.com/2011/04/graph-exploration-with-hadoop-mapreduce.html
[2]
http://codingwiththomas.blogspot.com/2011/04/graph-exploration-using-apache-hama-and.html

2011/9/2 Zhiyong Xie <zyxienju@gmail.com>

> Thank you Thomas! In short, SVM
> (http://en.wikipedia.org/wiki/Support_vector_machine) is a supervised
> learning classifier described as an optimization problem and solved by a
> gradient descent approach
> (http://en.wikipedia.org/wiki/Stochastic_gradient_descent).
> It is an iterative process that runs roughly one map/reduce pair per
> iteration: the map phase calculates the gradient value for each point, and
> the reduce phase optimizes the classifier. The BSP model seems native to
> scientific and graph processing in my mind, and I have not figured out or
> found much info online for this type of application so far.
>
> Best,
> Joe
>
> On Thu, Sep 1, 2011 at 10:36 AM, Thomas Jungblut <
> thomas.jungblut@googlemail.com> wrote:
>
> > Hi Joe,
> >
> > for non-insiders, would you please clarify what SGD and SVM are?
> > Then we could give you some tips on how to implement them in BSP.
> >
> > Greetz,
> > Thomas
> >
> > 2011/9/1 Zhiyong Xie <zyxienju@gmail.com>
> >
> > > Hi there,
> > >
> > > May I ask whether anyone else has looked into mapping SGD onto the BSP
> > > model too? I'm investigating whether the BSP model is a good candidate
> > > for implementing a distributed version of SVM SGD.
> > >
> > > Thanks!
> > > Joe
> > > --
> > > Joe (Zhiyong) Xie
> > > Graduate Student
> > >
> >
> >
> >
> > --
> > Thomas Jungblut
> > Berlin
> >
> > mobile: 0170-3081070
> >
> > business: thomas.jungblut@testberichte.de
> > private: thomas.jungblut@gmail.com
> >
>
>
>
> --
> Joe (Zhiyong) Xie
> Graduate Student
>


-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

--bcaec51b1659a7aea804abf6b9cc--