incubator-hama-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: [DISCUSS] Hama 0.5 roadmap
Date Tue, 14 Feb 2012 20:06:51 GMT
2012/2/14 Thomas Jungblut <thomas.jungblut@googlemail.com>

> >
> > Maybe 2~3 months later?
>
>
> I would love that schedule, but I don't think we can meet this
> time limit with our current throughput.
> Lin (may I call you that? :D) has to add more detailed descriptions to
> the tasks so that we can also work on them.
> So realistically we can make it in 5-6 months, our regular release
> schedule.
> I know there is a business behind this, but hurrying doesn't help us.
>
> HAMA-511 should not be a blocker for 0.5 release, it should be considered
> > as a long term task I think.
>
>
> +1.
>
> > We have to stabilize ourselves first rather than finding ways to
> > differentiate ourselves from the competition or considering new
> > paradigms.
>
>
> To be honest, the whole graph domain is conquered by Giraph. And that is
> perfectly fine, because they are focused on it.
> Anyway, we have to push Hama in another direction. We can support graph
> processing, but our great success should be in iterative algorithms, which
> can easily be implemented with BSP.
> The first idea is to make my K-Means "tasteful" to Mahout; they are a great
> driver, especially for researchers.
> The second idea is to support this Dryad functionality. There is no
> framework out there which has this ability, and since Hortonworks is
> supporting Microsoft, I think we can get some new people for Hama.
> And the third one is to improve the real-time processing. This will be
> greatly driven by the second idea; however, we have to add a simpler
> API for these tasks. This must be evaluated then (let's say in 0.6.0).
>
> speculative task execution
>
>
> Sorry, I seem not to have answered your question at all in the mail you
> linked.
> It is a cool feature, but I guess this should come along with
> fault tolerance, e.g. if we detect that a task is running longer than the
> others.
>
> A future target for Hama is a distributed cache like in BSPLib where you
> can get and put objects.
> I am keeping an eye on Apache Direct Memory; however, they are in an early
> stage of incubation, so this may take a bit of time.
>
> Everything else has been targeted so far.
>
> What about graduation?
> In my opinion we have stabilized our community so far. I expect two
> new committers soon, and a third one also seems to be on the way to
> contributing.
> The other tasks seem to be ticked off as well.
>

As a mentor I'd be +1 on starting to discuss graduation as soon as we
have new committers on board.
The above plan sounds perfect, so once it becomes reality I think we can go
ahead and start the discussion.
Tommaso



>
> 2012/2/14 Edward J. Yoon <edwardyoon@apache.org>
>
> > Are you looking for this link?
> > http://wiki.apache.org/hama/GroomServerFaultTolerance
> >
> > >> There are many tasks required to work on and to be integrated in order
> > >> to get (GroomServer) fault tolerance ready. Tasks include:
> > >> - GroomServer status/ resource monitor
> > >> - Failure Detection
> > >> - Checkpointed data integration
> > >> - Refactoring bsp() (if necessary)
> > >> - Master decision making
> >
> > Hmm, yes. And I missed the message compressor.
> >
> > Could you please split them into smaller tasks so that we can help
> > you?
> >
> > > I also would like to know why we rejected the idea of speculative task
> > > execution?
> >
> > I wanted to talk about speculative task execution before, but the idea
> > of speculative task execution has not been discussed/reported yet. (
> > http://markmail.org/thread/sq7neayhstqufrsz )
> >
> > To support this, we should add a 'Progress' feature first. Currently,
> > the job/task progress checker is not implemented yet.
> >
> > > How serious is the feature of real-time processing for Hama? I am told
> > that
> > > some are already using it for the purpose and read Thomas's blog on the
> > > same. Are we deferring it until we have a design for offline processing
> > or
> > > should we keep it in mind for fault tolerance?
> >
> > I think, yes if possible. But in some cases, maybe turning off
> > recovery mode is the best.
> >
> > I don't fully understand yet, so would you please describe the
> > issues which must be discussed/considered?
> >
> > On Tue, Feb 14, 2012 at 3:15 AM, Suraj Menon <menonsuraj5@gmail.com>
> > wrote:
> > > +1 on HAMA-511 not being a blocker.
> > >
> > > Also, I lost the wiki link that explains the fault-tolerant design. It
> > > would be helpful for understanding the recovery design. I believe that we
> > > will have the recovery BSP tasks scheduled to start running (with high
> > > probability) on the node with the data, where the checkpointed messages
> > > are written to HDFS with a single input split?
> > > I also would like to know why we rejected the idea of speculative task
> > > execution?
> > > I am currently working on HAMA-445 and HAMA-498. Thanks to Chiahung, I
> > have
> > > 2-3 good papers to read already :).
> > >
> > > How serious is the feature of real-time processing for Hama? I am told
> > that
> > > some are already using it for the purpose and read Thomas's blog on the
> > > same. Are we deferring it until we have a design for offline processing
> > or
> > > should we keep it in mind for fault tolerance?
> > >
> > >
> > > Thanks,
> > > Suraj
> > >
> > >
> > >
> > > On Mon, Feb 13, 2012 at 12:25 PM, Chia-Hung Lin <clin4j@googlemail.com
> > >wrote:
> > >
> > >> There are many tasks required to work on and to be integrated in order
> > >> to get (GroomServer) fault tolerance ready. Tasks include:
> > >> - GroomServer status/ resource monitor
> > >> - Failure Detection
> > >> - Checkpointed data integration
> > >> - Refactoring bsp() (if necessary)
> > >> - Master decision making
> > >>
> > >> Currently I am working on the first one, and there is already a patch
> > >> for the second on JIRA. In my view, it might be difficult to get those
> > >> tasks done within 2-3 months.
> > >>
> > >> On 13 February 2012 17:05, Edward J. Yoon <edwardyoon@apache.org>
> > wrote:
> > >> > Hi,
> > >> >
> > >> > I think it's time to discuss our 0.5 roadmap more clearly.
> > >> >
> > >> > IMO, I'd like to release Hama 0.5 with only fault-tolerant processing
> > >> > and clearly defined BSP and Pregel interfaces. Maybe 2~3 months later?
> > >> > And HAMA-511 should not be a blocker for the 0.5 release; it should be
> > >> > considered a long-term task, I think.
> > >> >
> > >> > There are a lot of new M/R alternatives, but no stable alternatives and
> > >> > no dominant player at the moment. We have to stabilize ourselves first
> > >> > rather than finding ways to differentiate ourselves from the
> > >> > competition or considering new paradigms.
> > >> >
> > >> > Please feel free to leave your opinion!
> > >> >
> > >> > --
> > >> > Best Regards, Edward J. Yoon
> > >> > @eddieyoon
> > >>
> >
> >
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
> >
>
>
>
> --
> Thomas Jungblut
> Berlin <thomas.jungblut@gmail.com>
>
