incubator-hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: [DISCUSS] Hama 0.5 roadmap
Date Tue, 14 Feb 2012 19:55:13 GMT
>
> Maybe 2~3 months later?


I would love that schedule, but I don't think we can meet this
time limit with our current throughput.
Lin (may I call you that? :D) has to add more detailed descriptions to
the tasks so that we can also work on them.
So realistically we can make it in 5-6 months, our regular release schedule.
I know there is a business interest behind this, but hurrying doesn't help us.

> HAMA-511 should not be a blocker for 0.5 release, it should be considered
> as a long term task I think.


+1.

> We have to stabilize ourselves first rather than finding ways to
> differentiate ourselves from the competition or considering new paradigms.


To be honest, the whole graph domain has been conquered by Giraph. And that is
perfectly fine, because they are focused on it.
Anyway, we have to push Hama in another direction. We can support graph
processing, but our great success should be in iterative algorithms, which
can easily be implemented with BSP.
The first idea is to make my K-Means "tasteful" to Mahout; they are a great
driver, especially for researchers.
The second idea is to support this Dryad functionality. There is no
framework out there which has this ability, and since Hortonworks is
supporting Microsoft, I think we can get some new people for Hama.
And the third one is to improve the real-time processing. This will be
greatly driven by the second idea; however, we have to add a simpler
API for these tasks. This must be evaluated then (let's say in 0.6.0).
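
To illustrate why iterative algorithms map well onto BSP, here is a minimal
single-process sketch of K-Means written as supersteps. It is not Hama's API
and not my actual K-Means implementation: the names bsp_kmeans, local_step
and nearest are made up for this example, and the "peers" are simply list
partitions processed in a loop. Each superstep has a local computation phase,
an exchange of partial sums, and a barrier after which every peer sees the
same new centers.

```python
# Single-process simulation of BSP-style K-Means (illustrative, not Hama's API).
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest(p: Point, centers: Sequence[Point]) -> int:
    """Index of the center closest to p (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2,
    )


def local_step(points: Sequence[Point], centers: Sequence[Point]) -> List[List[float]]:
    """Computation phase on one peer: per-center [sum_x, sum_y, count]."""
    sums = [[0.0, 0.0, 0] for _ in centers]
    for p in points:
        s = sums[nearest(p, centers)]
        s[0] += p[0]
        s[1] += p[1]
        s[2] += 1
    return sums


def bsp_kmeans(partitions: Sequence[Sequence[Point]],
               centers: Sequence[Point],
               max_supersteps: int = 20) -> List[Point]:
    """Run K-Means as a sequence of supersteps until the centers stop moving."""
    centers = list(centers)
    for _ in range(max_supersteps):
        # Computation: every peer works only on its own partition.
        messages = [local_step(part, centers) for part in partitions]
        # Communication + barrier: after the sync, all partial sums are visible.
        new_centers = []
        for i, old in enumerate(centers):
            sx = sum(m[i][0] for m in messages)
            sy = sum(m[i][1] for m in messages)
            n = sum(m[i][2] for m in messages)
            # Keep the old center if no point was assigned to it.
            new_centers.append((sx / n, sy / n) if n else old)
        if new_centers == centers:
            break  # converged: another superstep would change nothing
        centers = new_centers
    return centers


if __name__ == "__main__":
    parts = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]]
    print(bsp_kmeans(parts, [(0.0, 0.0), (10.0, 10.0)]))
```

Only the small per-center sums cross peer boundaries in each superstep, never
the points themselves, which is what makes this kind of algorithm a natural
fit for the model.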

> speculative task execution


Sorry, I seem not to have answered your question at all in the mail you
linked.
It is a cool feature, but I guess this should come along with
fault tolerance, e.g. if we detect that a task is running longer than the
others.
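
As a rough sketch of what "detecting that a task is running longer than the
others" could mean, the snippet below flags tasks whose elapsed time exceeds a
multiple of the median. Everything here is an assumption for illustration:
the function name find_stragglers, the slowdown_factor of 2.0 and the use of
elapsed seconds as the signal are not taken from Hama, which does not have a
progress checker yet.

```python
# Hypothetical straggler check; names and threshold are illustrative only.
from statistics import median
from typing import Dict, List


def find_stragglers(elapsed_seconds: Dict[str, float],
                    slowdown_factor: float = 2.0) -> List[str]:
    """Return ids of tasks running longer than slowdown_factor * median time."""
    if len(elapsed_seconds) < 2:
        return []  # nothing to compare a single task against
    threshold = slowdown_factor * median(elapsed_seconds.values())
    return sorted(t for t, s in elapsed_seconds.items() if s > threshold)


if __name__ == "__main__":
    running = {"task_0": 10.0, "task_1": 12.0, "task_2": 11.0, "task_3": 40.0}
    print(find_stragglers(running))
```

A flagged task would then be a candidate for a speculative duplicate, which is
why this fits together with the failure detection work.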

A future target for Hama is a distributed cache like in BSPLib, where you
can get and put objects.
I am keeping an eye on Apache DirectMemory; however, it is at an early stage
of incubation, so this may take a bit of time.

Everything else has been targeted so far.

What about graduation?
In my opinion our community has stabilized so far. I expect two
new committers soon, and a third one also seems to be on their way to
contributing.
The other tasks seem to be ticked off as well.

2012/2/14 Edward J. Yoon <edwardyoon@apache.org>

> Are you looking for this link?
> http://wiki.apache.org/hama/GroomServerFaultTolerance
>
> >> There are many tasks required to work on and to be integrated in order
> >> to get (GroomServer) fault tolerance ready. Tasks include:
> >> - GroomServer status/ resource monitor
> >> - Failure Detection
> >> - Checkpointed data integration
> >> - Refactoring bsp() (if necessary)
> >> - Master decision making
>
> Hmm, yes. and I missed message compressor.
>
> Could you please split them into smaller tasks so that we can help you?
>
> > I also would like to know why we rejected the idea of speculative task
> > execution?
>
> I wanted to talk about speculative task execution before but, the idea
> of speculative task execution is not discussed/reported yet. (
> http://markmail.org/thread/sq7neayhstqufrsz )
>
> To support this, we should add 'Progress' feature first. Currently,
> job/task progress checker is not implemented yet.
>
> > How serious is the feature of real-time processing for Hama? I am told that
> > some are already using it for the purpose and read Thomas's blog on the
> > same. Are we deferring it until we have a design for offline processing or
> > should we keep it in mind for fault tolerance?
>
> I think, yes if possible. But in some cases, maybe turning off
> recovery mode is the best.
>
> I don't understand perfectly yet, so would you please describe the
> issues which must be discussed/considered?
>
> On Tue, Feb 14, 2012 at 3:15 AM, Suraj Menon <menonsuraj5@gmail.com> wrote:
> > +1 on HAMA 511 should not be blocker.
> >
> > Also, I lost the wiki link that explains the fault tolerant design. It
> > would be helpful to understand the recovery design. I believe that we will
> > have the recovery BSP tasks scheduled to start running (in high probability)
> > on node with data where the checkpointed messages are written on HDFS with
> > a single input split?
> > I also would like to know why we rejected the idea of speculative task
> > execution?
> > I am currently working on HAMA-445 and HAMA-498. Thanks to Chiahung, I have
> > 2-3 good papers to read already :).
> >
> > How serious is the feature of real-time processing for Hama? I am told that
> > some are already using it for the purpose and read Thomas's blog on the
> > same. Are we deferring it until we have a design for offline processing or
> > should we keep it in mind for fault tolerance?
> >
> >
> > Thanks,
> > Suraj
> >
> >
> >
> > On Mon, Feb 13, 2012 at 12:25 PM, Chia-Hung Lin <clin4j@googlemail.com> wrote:
> >
> >> There are many tasks required to work on and to be integrated in order
> >> to get (GroomServer) fault tolerance ready. Tasks include:
> >> - GroomServer status/ resource monitor
> >> - Failure Detection
> >> - Checkpointed data integration
> >> - Refactoring bsp() (if necessary)
> >> - Master decision making
> >>
> >> Currently I am working on the first one, and with a patch for 2nd on
> >> jira already. In my viewpoint, it might be difficult to get those
> >> tasks done within 2-3 months.
> >>
> >> On 13 February 2012 17:05, Edward J. Yoon <edwardyoon@apache.org> wrote:
> >> > Hi,
> >> >
> >> > I think it's time to discuss our 0.5 roadmap more clearly.
> >> >
> >> > IMO, I'd like to release Hama 0.5 with only fault tolerant processing,
> >> > clearly defined BSP and Pregel interfaces. Maybe 2~3 months later?
> >> > And, HAMA-511 should not be a blocker for 0.5 release, it should be
> >> > considered as a long term task I think.
> >> >
> >> > There's a lot of new M/R alternatives but no stable alternatives and
> >> > no dominant player at the moment. We have to stabilize ourselves first
> >> > rather than finding ways to differentiate ourselves from the
> >> > competition or considering new paradigms.
> >> >
> >> > Please feel free to leave your opinion!
> >> >
> >> > --
> >> > Best Regards, Edward J. Yoon
> >> > @eddieyoon
> >>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
