hadoop-common-user mailing list archives

From Chris MacKenzie <stu...@chrismackenziephotography.co.uk>
Subject Re: What is the correct way to get a string back from a mapper or reducer
Date Thu, 03 Jul 2014 09:48:21 GMT
Thanks Bertrand,

Thanks very much for getting back to me, I deeply appreciate it. Global
variables lol ;O)

I’m a mature MSc (Software Engineering) student at Heriot Watt in Edinburgh
doing a distributed project for my Masters with the aid of Hadoop.

Though young in my career, I do understand the complexities of distributed
programming through an association with OpenMP, MPI, Haskell as well as
Single Assignment C on a range of hardware technologies.

I have a reducer which at this stage has an output of 6 strings. I would
like to take a random value from that output, pass it back to the driver and
subsequently on to the next map.

Would a record be the most sensible way to do this?

Many thanks,

Chris MacKenzie
 <http://www.chrismackenziephotography.co.uk/>

From:  Bertrand Dechoux <dechouxb@gmail.com>
Date:  Thursday, 3 July 2014 08:56
To:  Chris MacKenzie <studio@chrismackenziephotography.co.uk>
Cc:  "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject:  Re: What is the correct way to get a string back from a mapper or
reducer

The stackoverflow question doesn't add any useful information.

Like I said, you can emit the string inside a record. Or, if you really want
to handle lots of complexity, write it yourself to a file or a datastore
from the reducer. But you will then have to consider performance issues and
be able to handle the lifecycle of the task, its potential multiple attempts
and the global lifecycle of the job itself. So it's not necessarily obvious; it
would depend on the context.
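
The "emit the string inside a record" route can be sketched from the driver's
side as follows. This is only a minimal sketch under assumptions that are not
stated in the thread: the reducer writes its strings with the default text
output format (one record per line, key<TAB>value, or just the value when
there is no key) into part-r-* files; the names ReducerOutputPicker,
sampleOutputDir, readValues and pickRandom are hypothetical; and plain
java.nio stands in for Hadoop's FileSystem API so that the example runs
without a cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class ReducerOutputPicker {

    /** Creates a local directory that imitates a finished job's output (assumed layout). */
    static Path sampleOutputDir(List<String> records) {
        try {
            Path dir = Files.createTempDirectory("job1-output");
            Files.write(dir.resolve("part-r-00000"), records);
            Files.write(dir.resolve("_SUCCESS"), new byte[0]); // marker file, must be ignored
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Collects the value column of every record found in the part-r-* files. */
    static List<String> readValues(Path outputDir) {
        List<String> values = new ArrayList<>();
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(outputDir, "part-r-*")) {
            for (Path part : parts) {
                for (String line : Files.readAllLines(part)) {
                    if (line.isEmpty()) {
                        continue;
                    }
                    int tab = line.indexOf('\t');
                    // Lines without a tab are treated as value-only records.
                    values.add(tab >= 0 ? line.substring(tab + 1) : line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    /** Picks one of the reducer's strings at random. */
    static String pickRandom(List<String> values, Random rng) {
        if (values.isEmpty()) {
            throw new IllegalStateException("reducer produced no records");
        }
        return values.get(rng.nextInt(values.size()));
    }

    public static void main(String[] args) {
        Path out = sampleOutputDir(Arrays.asList("k1\talpha", "k2\tbeta", "k3\tgamma"));
        String chosen = pickRandom(readValues(out), new Random());
        System.out.println("value to pass to the next job: " + chosen);
    }
}
```

In a real driver the same read would happen after job.waitForCompletion(true)
returns, and the chosen string would be put into the next job's Configuration
with conf.set(...) before that job is submitted.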

The concept of "global variable" in distributed computing should be well
understood. In essence, it's not possible to have a distributed,
always-available, always-consistent variable (see CAP).
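
One way to picture why a value set inside a task never reaches the driver: the
job configuration is written out when the job is submitted, and each task
loads its own copy. The sketch below only imitates that behaviour, with
java.util.Properties standing in for Hadoop's Configuration; it is an analogy
rather than Hadoop code, and the names ConfCopyDemo, submit and loadInTask are
hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ConfCopyDemo {

    /** Imitates job submission: the driver's settings are serialized to bytes. */
    static byte[] submit(Properties driverConf) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            driverConf.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Imitates task start-up: the task builds its own object from the shipped bytes. */
    static Properties loadInTask(byte[] shipped) {
        Properties taskConf = new Properties();
        try {
            taskConf.load(new ByteArrayInputStream(shipped));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return taskConf;
    }

    public static void main(String[] args) {
        Properties driverConf = new Properties();
        driverConf.setProperty("sub", "help");

        Properties taskConf = loadInTask(submit(driverConf));
        taskConf.setProperty("sub", "Test"); // changes the task's copy only

        System.out.println("driver sees: " + driverConf.getProperty("sub")); // help
        System.out.println("task sees:   " + taskConf.getProperty("sub"));   // Test
    }
}
```

The write in the task succeeds, but it lands on a separate object, which
matches the behaviour described in the original question below.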

Bertrand Dechoux


On Thu, Jul 3, 2014 at 7:51 AM, Chris MacKenzie
<studio@chrismackenziephotography.co.uk> wrote:
> Hi Bertrand,
> 
> Thank you for your quick response. I simply need a string returned from
> the reducer.
> 
> I have looked all over the place for a solution. I keep coming back to:
> http://stackoverflow.com/questions/16222205/use-global-variable-in-reudcer-class
> 
> Chris MacKenzie
> <http://www.chrismackenziephotography.co.uk/>
> 
> 
> 
> 
> From:  Bertrand Dechoux <dechouxb@gmail.com>
> Reply-To:  <user@hadoop.apache.org>
> Date:  Thursday, 3 July 2014 06:43
> To:  "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject:  Re: What is the correct way to get a string back from a mapper
> or reducer
> 
> 
> From an architecture point of view, Configuration is immutable once the job
> is started, even though the API does not reflect that explicitly. I would
> say in a record. But the question is: what do you want to achieve?
> 
> Bertrand Dechoux
> 
> 
> On Thu, Jul 3, 2014 at 7:37 AM, Chris MacKenzie
> <studio@chrismackenziephotography.co.uk> wrote:
> 
> Hi,
> 
> I have the following code and am using hadoop 2.4:
> 
> In my driver:
>         Configuration conf = new Configuration();
>         conf.set("sub", "help");
>         ...
>         String s = conf.get("sub");
> 
> In my reducer:
>         Configuration conf = context.getConfiguration();
>         conf.set("sub", "Test");
> 
> When I test the value in the driver, it isn't updated following the reduce.
> 
> 
> Best,
> 
> Chris MacKenzie
>  <http://www.chrismackenziephotography.co.uk/>
> 
> 
> 
> 
> 
> 
> 
> 



