reef-dev mailing list archives

From Andrey Meleshko <andr...@microsoft.com>
Subject RE: IMRU initialization with train data
Date Wed, 06 Jul 2016 17:16:25 GMT
Created 1484 to track this. Also listed some of the topics you answered offline.

Thank you
> -----Original Message-----
> From: Dhruv Mahajan [mailto:dhruv.mahajan@gmail.com]
> Sent: Wednesday, July 6, 2016 10:01 AM
> To: dev@reef.apache.org
> Subject: Re: IMRU initialization with train data
> 
> Oh, definitely.
> 
> On Wed, Jul 6, 2016 at 8:52 AM, Andrey Meleshko
> <andreym@microsoft.com>
> wrote:
> 
> > Well, I am new to the code and don't have experience with map reduce
> > implementations, so there is a lot of reverse engineering for me.
> > Although the examples could be improved, I am wondering if a small wiki
> > page on IMRU collaboration/dataflow is in order?
> > Does it deserve a JIRA?
> >
> > Thank you,
> > Andrey
> > > -----Original Message-----
> > > From: Dhruv Mahajan [mailto:dhruv.mahajan@gmail.com]
> > > Sent: Tuesday, July 5, 2016 9:15 PM
> > > To: dev@reef.apache.org
> > > Subject: Re: IMRU initialization with train data
> > >
> > > Hi Andrey
> > >
> > > > You are right and should rightly blame it on me. We can improve both
> > > > examples to use some data from the partitions. For example, the
> > > > broadcast example can take the value to broadcast/reduce from the
> > > > partitions, while the mapper count example can actually compute
> > > > 1 + 2 + 3 + ... + numEvaluators, where each evaluator is passed its
> > > > number by its partition?
> > >
> > > Dhruv
> > >
> > > On Tue, Jul 5, 2016 at 12:55 PM, Andrey Meleshko
> > > <andreym@microsoft.com>
> > > wrote:
> > >
> > > > Yes, that's the example.
> > > > > Understood, so the constructor of the map function determines
> > > > > whether the initial data is used.
> > > > > So it is the MapFunction's responsibility to merge that data with
> > > > > mapInput during each iteration when necessary.
> > > >
> > > > > PS: There is another example of a MapFunction: IdentityMapFunction,
> > > > > which does take partition data in its constructor, but doesn't use
> > > > > it. Another example to improve, maybe.
> > > >
> > > > Thank you
> > > > /Andrey
> > > >
> > > > > -----Original Message-----
> > > > > From: Markus Weimer [mailto:markus@weimo.de]
> > > > > Sent: Tuesday, July 5, 2016 12:43 PM
> > > > > To: dev@reef.apache.org
> > > > > Subject: Re: IMRU initialization with train data
> > > > >
> > > > > On 2016-07-05 12:04 PM, Andrey Meleshko wrote:
> > > > > > > The example does configure RandomInputDataset, which does
> > > > > > > create random data (2 doubles per partition by default in
> > > > > > > the RandomInputPartition class).
> > > > >
> > > > > > Just because it got me confused: I assume we are talking about
> > > > > > this example, right?
> > > > >
> > > > >
> > >
> `Org.Apache.REEF.IMRU.Examples.PipelinedBroadcastReduce.PipelinedBro
> > > > > a
> > > > > dcastAndReduce`
> > > > >
> > > > >
> > > > > > In that example, the `BroadcastReceiverReduceSenderMapFunction`
> > > > > > is used as the `IMapFunction`. That class does not expect any
> > > > > > data in its constructor. Hence, the `RandomInputPartition` isn't
> > > > > > actually instantiated. It would also not be instantiated on REEF.
> > > > >
> > > > > To improve the example, the map function should indeed depend on
> > > > > the data configured.
> > > > >
> > > > > Markus
> > > >
> >
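
For readers following the thread, the dataflow discussed above (a map function that receives its partition data in the constructor and merges it with the broadcast mapInput on every iteration) can be sketched roughly as follows. This is a standalone Python illustration, not REEF code: `PartitionMapFunction`, `run_imru`, and the choice of broadcasting the iteration number are all hypothetical names and assumptions made for the example, and none of them correspond to the actual Org.Apache.REEF.IMRU API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionMapFunction:
    """Hypothetical map function (not the REEF IMapFunction interface).

    The partition data is handed over at construction time, which is
    what makes the partition get loaded at all.
    """
    partition_value: int  # assumption: each partition holds its evaluator number

    def map(self, map_input: int) -> int:
        # Merge the broadcast input with the locally held partition data.
        return self.partition_value * map_input


def run_imru(num_evaluators: int, iterations: int) -> List[int]:
    """Simulate iterate / map / reduce / update in a single process."""
    # One map function per evaluator; partition i holds the number i + 1.
    mappers = [PartitionMapFunction(i + 1) for i in range(num_evaluators)]
    results = []
    for iteration in range(1, iterations + 1):
        broadcast = iteration                          # update: next map input
        mapped = [m.map(broadcast) for m in mappers]   # map on every evaluator
        results.append(sum(mapped))                    # reduce: sum of outputs
    return results


if __name__ == "__main__":
    # With a broadcast value of 1, the first result is 1 + 2 + ... + numEvaluators.
    print(run_imru(num_evaluators=4, iterations=3))
```

In this sketch the first iteration yields the 1 + 2 + ... + numEvaluators sum proposed for the mapper count example, and later iterations show the partition data being reused with a changing mapInput.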