hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Reset Input RecordReader
Date Tue, 29 Nov 2011 08:07:49 GMT
Oh yes, there was that little problem ;) Thanks for the reminder.
So your "fix" would be to let the user implement several chained computation
methods?
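
To make the "chained computation methods" idea concrete, here is a minimal
sketch in plain Java. Everything in it is an assumption made for illustration:
the Superstep interface, the run() driver, and the in-memory checkpoint list
are hypothetical and are not part of Hama's API. It only shows that when each
superstep is a separate unit, a restart can hand the checkpointed messages
straight to the step where the crash happened.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class SuperstepChainSketch {

    /** Hypothetical unit of computation; an implicit sync() follows each call. */
    interface Superstep {
        // Takes the messages delivered by the previous sync, returns the messages to send.
        List<String> compute(List<String> incoming);
    }

    /**
     * Hypothetical driver: runs steps[startIndex..] in order.
     * If checkpoints is non-null, the input of every executed step is recorded,
     * so checkpoints.get(i) holds the messages step i started from
     * (this indexing only lines up when startIndex is 0).
     */
    static List<String> run(List<Superstep> steps, int startIndex,
                            List<String> messages, List<List<String>> checkpoints) {
        List<String> current = messages;
        for (int i = startIndex; i < steps.size(); i++) {
            if (checkpoints != null) {
                checkpoints.add(new ArrayList<>(current)); // checkpoint before the step
            }
            current = steps.get(i).compute(current);
            // A real framework would perform the barrier sync here.
        }
        return current;
    }

    /** Three toy supersteps standing in for compute1(), compute2(), compute3(). */
    static List<Superstep> demoSteps() {
        List<Superstep> steps = new ArrayList<>();
        steps.add(in -> in.stream().map(s -> s + "!").collect(Collectors.toList()));
        steps.add(in -> in.stream().map(String::toUpperCase).collect(Collectors.toList()));
        steps.add(in -> Collections.singletonList(String.join("+", in)));
        return steps;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("a");
        input.add("b");

        List<List<String>> checkpoints = new ArrayList<>();
        List<String> full = run(demoSteps(), 0, input, checkpoints);

        // Simulated recovery: restart at the second step from its checkpointed input.
        List<String> recovered = run(demoSteps(), 1, checkpoints.get(1), null);

        System.out.println(full.equals(recovered));
    }
}
```

The recovery path never has to count sync() calls inside one large bsp()
method; it only needs the step index and the messages saved for that step.
The sketch ignores any state a user keeps in fields, which is exactly the
part that stays hard to recover.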

2011/11/29 ChiaHung Lin <chl501@nuk.edu.tw>

> I slightly disagree with the "easily recoverable" part. Consider the
> following code snippet:
>
> bsp() {
>   var i, j, k;
>   compute1()
>   sync()
>   compute2()
>   sync()
>   for (...) {
>     computex(i, k)
>     sync()
>     computey(j)
>     sync()
>   } // for
> }
>
> Suppose it has 43 supersteps, it has checkpointed data at the 23rd
> superstep, and then the bsp task crashes. The steps to recover may
> include: 1.) analyzing the source to determine how many sync() calls lead
> up to the 23rd superstep; 2.) the main thread needs to find a way to jump
> to that function and feed it the checkpoint data, and maybe also ensure
> that this does not violate atomicity with variables somewhere else.
>
> The reason why I think recovery might be easier with a somewhat
> finer-grained unit is that we can do it by feeding checkpointed messages
> back to a superstep directly (as below). (Of course this is not the only
> way; we can discuss and probably find a better solution.)
>
> // in framework
> Superstep step ...;
> if(recovered) {
>  step = supersteps.get(22)
>  step.recover(checkpointedData)
> }
> ...
>
>
> superstep() {
>  if(recovered) {
>    ... getCheckpointedMessage()
>    // do something
>  }
> }
>
> As for sync(), it is not necessary to separate sync() from the superstep;
> we can have functions that let users specify e.g. syncBefore(),
> syncAfter(), etc. for when a superstep is called.
>
>
> -----Original message-----
> From:Thomas Jungblut <thomas.jungblut@googlemail.com>
> To:hama-dev@incubator.apache.org
> Date:Tue, 29 Nov 2011 07:24:45 +0100
> Subject:Re: Reset Input RecordReader
>
> Yep, it is just a reopen. Let's call it that. I'm going to put together a
> patch later.
> So it is just a re-read of the same assigned split. So no problem ;)
>
> Yes, BSP is not atomic, but as long as the user sticks with the
> communication and the stuff from IO (not using fields in a hashmap like
> pagerank or so), this is always easily recoverable.
> But you cannot express every algorithm with just one sync at the end of a
> function, so BSP() must be somewhere anyway.
> For me it is a question of algorithm design: as long as you use the major
> parts of our framework, this is fail-safe.
>
>
> 2011/11/29 ChiaHung Lin <chl501@nuk.edu.tw>
>
> > Does it mean that for each iteration the computation (the code within
> > the bsp function) needs to read the same or different input?
> >
> > I ask because it seems related to what I mentioned previously regarding
> > the rework of the bsp function (providing a smaller computation unit,
> > e.g. superstep).
> >
> > bsp(...) {
> > sync()
> > // superstep 1
> > // read from hdfs
> > // compute1()
> > // send messages ...
> > sync()
> > // superstep 2
> > // read from/ write pvfs
> > // compute2()
> > sync()
> > // superstep 3
> > // write to cassandra
> > // compute3()
> > sync()
> > ...
> > }
> >
> > The reason is that bsp() consists of several supersteps, and for each
> > iteration users probably want to read from/write to different
> > inputs/outputs. This is a pattern. Although the current bsp() is
> > flexible, allowing users to write whatever they want within it, the
> > disadvantages I observe include 1.) recovery is difficult 2.) a lot of
> > code gets mixed up together within one function.
> >
> > The first one may be overcome by source code instrumentation, but that
> > is not a good solution because users do not know what goes wrong, or
> > where, when bsp() doesn't function well.
> >
> > The second one is more minor, and the code can e.g. be reorganized in a
> > more modular way. But that looks similar to what we would get if we
> > provided e.g. superstep().
> >
> > Just some thoughts.
> >
> > -----Original message-----
> > From:Thomas Jungblut <thomas.jungblut@googlemail.com>
> > To:hama-dev@incubator.apache.org
> > Date:Tue, 29 Nov 2011 04:39:38 +0100
> > Subject:Reset Input RecordReader
> >
> > Hi all,
> >
> > I need some kind of reset-logic for the input of a BSP Job.
> > It should be quite easy to add:
> > - add a method called resetInput() in BSPPeer
> > - in the concrete implementation it just closes the input split and
> > opens it again
> >
> > If you're interested why I need this, I'm currently writing a k-means
> > clustering in BSP.
> > I need to iterate over all vectors from the input and measure distance
> > against a set of centers in each superstep, so it would help me to
> > "reset" the input.
> >
> > Do you think I can add this right away into the trunk?
> >
> > --
> > Thomas Jungblut
> > Berlin <thomas.jungblut@gmail.com>
> >
> >
> > --
> > ChiaHung Lin
> > Department of Information Management
> > National University of Kaohsiung
> > Taiwan
> >
>
>
>
> --
> Thomas Jungblut
> Berlin <thomas.jungblut@gmail.com>
>
>
> --
> ChiaHung Lin
> Department of Information Management
> National University of Kaohsiung
> Taiwan
>
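
For the original request quoted above (a resetInput() that closes the input
split and opens it again so k-means can re-read all vectors in each
superstep), a rough single-process sketch follows. It is not Hama code:
ReopenableInput, readNext(), and the Supplier-based "split" are assumptions
made up for illustration, the vectors are one-dimensional, and there is no
messaging or sync between peers.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

final class ResetInputSketch {

    /** Hypothetical stand-in for a peer's input; not Hama's actual BSPPeer API. */
    static final class ReopenableInput {
        private final Supplier<Iterator<Double>> opener; // "opens" the assigned split
        private Iterator<Double> records;

        ReopenableInput(Supplier<Iterator<Double>> opener) {
            this.opener = opener;
            this.records = opener.get();
        }

        /** Returns the next record, or null once the split is exhausted. */
        Double readNext() {
            return records.hasNext() ? records.next() : null;
        }

        /** Drops the current reader and reopens the same split from the start. */
        void resetInput() {
            records = opener.get();
        }
    }

    /** One pass over the whole input per iteration, then a reset for the next pass. */
    static double[] kMeans(ReopenableInput input, double[] initialCenters, int iterations) {
        double[] centers = initialCenters.clone();
        for (int it = 0; it < iterations; it++) {
            double[] sum = new double[centers.length];
            int[] count = new int[centers.length];
            Double v;
            while ((v = input.readNext()) != null) {
                int best = 0; // index of the nearest center so far
                for (int c = 1; c < centers.length; c++) {
                    if (Math.abs(v - centers[c]) < Math.abs(v - centers[best])) {
                        best = c;
                    }
                }
                sum[best] += v;
                count[best]++;
            }
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) {
                    centers[c] = sum[c] / count[c]; // new center = mean of its points
                }
            }
            input.resetInput(); // without this, the next pass would read nothing
        }
        return centers;
    }

    public static void main(String[] args) {
        List<Double> split = Arrays.asList(1.0, 2.0, 3.0, 10.0, 11.0, 12.0);
        ReopenableInput input = new ReopenableInput(split::iterator);
        double[] centers = kMeans(input, new double[] {0.0, 5.0}, 5);
        System.out.println(Arrays.toString(centers));
    }
}
```

Because every pass re-reads the same assigned split, the reset does not change
what a peer sees, only where its read position is.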



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
