reef-dev mailing list archives

From Dhruv Mahajan <dhruv.maha...@gmail.com>
Subject Re: IMRU initialization with train data
Date Wed, 06 Jul 2016 17:01:11 GMT
Oh, definitely.

On Wed, Jul 6, 2016 at 8:52 AM, Andrey Meleshko <andreym@microsoft.com>
wrote:

> Well, I am new to the code and don't have experience with MapReduce
> implementations, so there is a lot of reverse engineering for me.
> Although the examples could be improved, I am wondering whether a small
> wiki page on IMRU collaboration/dataflow is in order.
> Does it deserve a JIRA?
>
> Thank you,
> Andrey
> > -----Original Message-----
> > From: Dhruv Mahajan [mailto:dhruv.mahajan@gmail.com]
> > Sent: Tuesday, July 5, 2016 9:15 PM
> > To: dev@reef.apache.org
> > Subject: Re: IMRU initialization with train data
> >
> > Hi Andrey
> >
> > You are right, and the blame is rightly mine. We can improve both
> > examples to use some data from the partitions. For example, the broadcast
> > example can take the value to broadcast/reduce from the partitions, while
> > the mapper count example can actually compute 1+2+3+...+numEvaluators,
> > where each evaluator is passed its number by its partition?
> >
> > Dhruv
> >
> > On Tue, Jul 5, 2016 at 12:55 PM, Andrey Meleshko
> > <andreym@microsoft.com>
> > wrote:
> >
> > > Yes, that's the example.
> > > Understood, so the constructor of the map function determines whether
> > > the initial data is used.
> > > So it is the MapFunction's responsibility to merge that data with
> > > mapInput during each iteration when necessary.
> > >
> > > PS: There is another example of a MapFunction, IdentityMapFunction,
> > > which does take partition data in its constructor but doesn't use it.
> > > Maybe another example to improve.
> > >
> > > Thank you
> > > /Andrey
> > >
> > > > -----Original Message-----
> > > > From: Markus Weimer [mailto:markus@weimo.de]
> > > > Sent: Tuesday, July 5, 2016 12:43 PM
> > > > To: dev@reef.apache.org
> > > > Subject: Re: IMRU initialization with train data
> > > >
> > > > On 2016-07-05 12:04 PM, Andrey Meleshko wrote:
> > > > > The example does configure RandomInputDataset, which does create
> > > > > random data (2 doubles per partition by default in
> > > > > > RandomInputPartition class).
> > > >
> > > > > Just because it got me confused: I assume we are talking about this
> > > > > example, right?
> > > > >
> > > > > `Org.Apache.REEF.IMRU.Examples.PipelinedBroadcastReduce.PipelinedBroadcastAndReduce`
> > > >
> > > >
> > > > > In that example, the `BroadcastReceiverReduceSenderMapFunction` is
> > > > > used as the `IMapFunction`. That class does not expect any data in
> > > > > its constructor.
> > > > > Hence, the `RandomInputPartition` isn't actually instantiated. It
> > > > > would also not be instantiated on REEF.
> > > >
> > > > To improve the example, the map function should indeed depend on the
> > > > data configured.
> > > >
> > > > Markus
> > >
>
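
The behaviour discussed in this thread (a partition is only instantiated when the map function's constructor asks for it, and the proposed 1+2+...+numEvaluators example) can be illustrated with a small model. This is a sketch in plain Python, not REEF code: it uses none of the REEF or Tang APIs, and `InputPartition`, `BroadcastOnlyMapFunction`, `PartitionAwareMapFunction`, `instantiate` and `run_iteration` are all hypothetical names chosen for illustration. The toy `instantiate` function merely stands in for constructor-based dependency injection.

```python
import inspect


class InputPartition:
    """Hypothetical stand-in for the data partition held by one evaluator."""

    instances_created = 0  # counts how many partitions were actually built

    def __init__(self, evaluator_number):
        InputPartition.instances_created += 1
        self.evaluator_number = evaluator_number


class BroadcastOnlyMapFunction:
    """Asks for no partition in its constructor, so none is ever built."""

    def __init__(self):
        pass

    def map(self, map_input):
        # Passes the broadcast value through unchanged.
        return map_input


class PartitionAwareMapFunction:
    """Declares the partition as a constructor dependency and uses it."""

    def __init__(self, partition):
        self._partition = partition

    def map(self, map_input):
        # Merge the broadcast input with the locally held partition data.
        return map_input + self._partition.evaluator_number


def instantiate(map_function_class, evaluator_number):
    """Toy injector (assumption): build a partition only if it is requested."""
    params = inspect.signature(map_function_class).parameters
    if "partition" in params:
        return map_function_class(InputPartition(evaluator_number))
    return map_function_class()


def run_iteration(map_function_class, num_evaluators, broadcast_value=0):
    """Run one map/reduce round, where the reduce step is a plain sum."""
    mappers = [instantiate(map_function_class, n)
               for n in range(1, num_evaluators + 1)]
    return sum(m.map(broadcast_value) for m in mappers)
```

In this model, `run_iteration(BroadcastOnlyMapFunction, 4)` creates no partitions at all, while `run_iteration(PartitionAwareMapFunction, 4)` creates four and reduces to 10, that is 1+2+3+4.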
