hadoop-hdfs-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: Map reduce technique
Date Wed, 06 Mar 2013 05:50:17 GMT
Hi Balachandar,

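In MapReduce, interpreting input files as key value pairs is accomplished
through InputFormats.  The InputFormat also decides how each file is divided
into splits, so the framework does the splitting for you.  Some common
InputFormats are TextInputFormat, which uses lines in a text file as values
and their byte offset into the file as keys; KeyValueTextInputFormat, which
interprets the text before the first separator on a line (a tab by default)
as the key and the rest as the value; and WholeFileInputFormat (a widely
used example class rather than one bundled with Hadoop), which uses an
entire file as a single value.  If you wanted to process an image file in a
specific way, you would probably need to supply your own InputFormat, for
example one that never splits a file, so that each image reaches a single
map call whole.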
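
To make the difference between these record shapes concrete, here is a
minimal sketch in plain Java.  It does not use the Hadoop API at all: the
class InputFormatSketch and its methods are hypothetical names that only
imitate the key value pairs each format would hand to a mapper, assuming
newline-terminated UTF-8 text and a tab separator.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: none of these methods are Hadoop APIs.
final class InputFormatSketch {

    // Like TextInputFormat: key = byte offset of the line start, value = the line.
    static List<Map.Entry<Long, String>> textRecords(byte[] data) {
        List<Map.Entry<Long, String>> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            if (atEnd || data[i] == '\n') {
                // Skip the empty remainder that follows a trailing newline.
                if (!(atEnd && start == i)) {
                    String line = new String(data, start, i - start, StandardCharsets.UTF_8);
                    out.add(new SimpleEntry<>((long) start, line));
                }
                start = i + 1;
            }
        }
        return out;
    }

    // Like KeyValueTextInputFormat: key = text before the first tab,
    // value = the rest; a line without a tab becomes the key with an empty value.
    static List<Map.Entry<String, String>> keyValueRecords(byte[] data) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (Map.Entry<Long, String> rec : textRecords(data)) {
            String line = rec.getValue();
            int tab = line.indexOf('\t');
            String key = tab < 0 ? line : line.substring(0, tab);
            String value = tab < 0 ? "" : line.substring(tab + 1);
            out.add(new SimpleEntry<>(key, value));
        }
        return out;
    }

    // Like a whole-file format: the entire file is exactly one record.
    static List<byte[]> wholeFileRecords(byte[] data) {
        List<byte[]> out = new ArrayList<>();
        out.add(data);
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "k1\tv1\nplain\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("text:      " + textRecords(data));
        System.out.println("key/value: " + keyValueRecords(data));
        System.out.println("whole:     " + wholeFileRecords(data).size() + " record(s)");
    }
}
```

For an image, only the last shape is usually meaningful, because a mapper
that sees one line or one chunk at a time cannot decode the picture.  In
real Hadoop code you would get the same effect by writing an InputFormat
that does not split its files and whose record reader returns the full
file contents as one value.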

Does that help?


On Tue, Mar 5, 2013 at 9:37 PM, AMARNATH, Balachandar <
BALACHANDAR.AMARNATH@airbus.com> wrote:

>  Hi,
> I am new to the map reduce paradigm. I read in a tutorial that the ‘map’
> function splits the data into key value pairs. Does this mean the
> map-reduce framework automatically splits the data into pieces, or do we
> need to explicitly provide the method to split the data into pieces? If it
> splits automatically, how does it split an image file (size etc.)? I ask
> because processing an image file as a whole will give different results
> than processing it in chunks.
> With thanks and regards
> Balachandar
