hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Fw: reducers outputs
Date Mon, 30 Jan 2012 06:34:32 GMT
Aliyeh,

You may be complicating things here.

HDFS and MapReduce are two separate components of Hadoop. HDFS
provides a distributed filesystem; MapReduce provides a distributed
processing layer. Neither is tied to the other.

A reducer creates an output file on a 'filesystem'. It neither knows
nor cares whether it is talking to HDFS. All it does is run the
user's reduce function and persist the output to whatever filesystem
it was given (HDFS, local disk, or something else). If that filesystem
is HDFS, the NameNode tracks the output like any other file, whatever
the file is called.
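
A rough sketch of that separation in plain Java, in case it helps. This
is not Hadoop code: OutputStore, InMemoryStore, LocalDirStore and
ReducerSketch are names I made up for illustration, and the in-memory
store only stands in for "some other filesystem". The point is that the
reduce logic is written once against an interface and never learns
which backend it got.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a filesystem abstraction; not a Hadoop class.
interface OutputStore {
    void write(String fileName, String contents);
}

// Keeps files in memory, standing in for "some other filesystem".
class InMemoryStore implements OutputStore {
    final Map<String, String> files = new LinkedHashMap<>();

    @Override
    public void write(String fileName, String contents) {
        files.put(fileName, contents);
    }
}

// Writes files under a directory on the local disk.
class LocalDirStore implements OutputStore {
    private final Path dir;

    LocalDirStore(Path dir) {
        this.dir = dir;
    }

    @Override
    public void write(String fileName, String contents) {
        try {
            Files.write(dir.resolve(fileName), contents.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class ReducerSketch {
    // The "reducer": sums the values for one key and persists the result.
    // It only sees the OutputStore interface, never the concrete backend.
    static void reduce(String key, List<Integer> values, OutputStore store, String fileName) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        store.write(fileName, key + "\t" + sum + "\n");
    }

    public static void main(String[] args) throws IOException {
        InMemoryStore memory = new InMemoryStore();
        reduce("apple", Arrays.asList(1, 2, 3), memory, "part-00000");

        Path dir = Files.createTempDirectory("reducer-sketch");
        reduce("apple", Arrays.asList(1, 2, 3), new LocalDirStore(dir), "part-00000");

        // Same reducer code, same bytes, two different backends.
        String fromDisk = new String(
                Files.readAllBytes(dir.resolve("part-00000")), StandardCharsets.UTF_8);
        System.out.println(memory.files.get("part-00000").equals(fromDisk));
    }
}
```

In real Hadoop the same role is played by the filesystem the job's
output path resolves to, so the reducer code does not change when the
output location does.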

Have you worked through a standard Hadoop tutorial to understand how
things fit together? Try taking a look at
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html

To override the output filenames, in case you want something other
than the default "part-xxxxx" names, the easiest way is to use
MultipleOutputs with your own named output, documented here:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
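
To give an idea of what ends up in the output directory, here is a
small plain-Java sketch of the naming patterns. It calls no Hadoop
code: OutputNames and both methods are hypothetical helpers, and the
patterns reflect my understanding of the old "mapred" API (default
files named part-00000 and so on, and a named output written from a
reducer named <name>-r-00000, with names restricted to letters and
digits). Please check against the MultipleOutputs documentation for
your version.

```java
import java.util.Locale;

// Hypothetical helper that only mirrors file naming patterns; not part of Hadoop.
class OutputNames {
    // Default reducer output, assumed pattern: part-00000, part-00001, ...
    static String defaultPart(int partition) {
        return String.format(Locale.ROOT, "part-%05d", partition);
    }

    // Named output written from a reducer, assumed pattern: <name>-r-00000, ...
    static String namedOutput(String name, int partition) {
        // Assumption: named outputs may contain only letters and digits.
        if (!name.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("name must be letters and digits only: " + name);
        }
        return String.format(Locale.ROOT, "%s-r-%05d", name, partition);
    }

    public static void main(String[] args) {
        System.out.println(defaultPart(0));
        System.out.println(namedOutput("summary", 0));
    }
}
```

Either way these are ordinary files in the job's output directory; only
the leading part of the name changes.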

On Mon, Jan 30, 2012 at 11:39 AM, aliyeh saeedi <a1_saeedi@yahoo.com> wrote:
> I studied it, but I could not get the point. I mean if I save reducer's
> output with my own selected names, does NameNode behave with them like other
> files?
> regards.
>
> ________________________________
> From: Ashwanth Kumar <ashwanthkumar@googlemail.com>
>
> To: mapreduce-user@hadoop.apache.org; aliyeh saeedi <a1_saeedi@yahoo.com>
> Sent: Monday, 30 January 2012, 9:25
> Subject: Re: Fw: reducers outputs
>
> You should have a look at this -
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/FileOutputFormat.html
>
>  - Ashwanth Kumar
>
> On Mon, Jan 30, 2012 at 11:17 AM, aliyeh saeedi <a1_saeedi@yahoo.com> wrote:
>
>
>
>
> I want to save them with my own names, How NameNode will keep their names?
>
> ________________________________
> From: Joey Echeverria <joey@cloudera.com>
> To: mapreduce-user@hadoop.apache.org; aliyeh saeedi <a1_saeedi@yahoo.com>
> Sent: Sunday, 29 January 2012, 17:10
> Subject: Re: reducers outputs
>
> Reduce output is normally stored in HDFS, just like your other files.
> Are you seeing different behavior?
>
> -Joey
>
> On Sun, Jan 29, 2012 at 1:05 AM, aliyeh saeedi <a1_saeedi@yahoo.com> wrote:
>> Hi
>> I want to save reducers outputs like other files in Hadoop. Does NameNode
>> keep any information about them? How can I do this?
>> Or can I add a new component to Hadoop like NameNode and make JobTracker
>> to
>> consult with it too (I mean I want to make JobTracker to consult with
>> NameNode AND myNewComponent both)?
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
>
>
>
>
>
>



-- 
Harsh J
Customer Ops. Engineer, Cloudera
