hadoop-common-user mailing list archives

From "Sridhar Raman" <sridhar.ra...@gmail.com>
Subject Re: Input and Output types?
Date Thu, 01 May 2008 12:51:15 GMT
Thanks.

On Fri, Apr 18, 2008 at 9:14 PM, Owen O'Malley <oom@yahoo-inc.com> wrote:

>
> On Apr 17, 2008, at 11:20 PM, Sridhar Raman wrote:
>
> > I am new to MapReduce and Hadoop, and I have managed to find my way through
> > with a few programs.  But I still have some doubts that are constantly
> > clinging onto me.  I am not too sure whether these are basic doubts, or just
> > some documentation that I missed somewhere.
> >
>
> Take a look at  http://tinyurl.com/4y7776 under InputFormats.
>
> > 1)  Should my input _always_ be text files?  What if my input is in the form
> > of Java objects?  Where do I handle this conversion?
> >
>
> You can define your own InputFormat that reads an arbitrary format, or use
> SequenceFileInputFormat that reads SequenceFiles. SequenceFiles are a file
> format defined by Hadoop to hold binary data consisting of Writable keys and
> values.
>
> > 2)  How do I control how the output is written?  For example, if I want to
> > output in a format that is my own, how do I do it?
> >
>
> That is controlled by the OutputFormat. It defaults to TextOutputFormat,
> but you can either use SequenceFileOutputFormat or make your own.
>
> -- Owen
>
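
On question 1, the answer above says SequenceFiles hold binary data made of Writable keys and values, so the conversion from a Java object happens in the object's own serialization methods. Below is a minimal sketch of that idea, not code from this thread. PointWritable and PointWritableDemo are hypothetical names. To keep the example runnable without the Hadoop jar, the class does not actually implement org.apache.hadoop.io.Writable; it only provides the same two methods that interface declares, write(DataOutput) and readFields(DataInput).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type. In a real job it would declare
// "implements org.apache.hadoop.io.Writable"; the two methods below
// have the same signatures as that interface.
class PointWritable {
    double x;
    double y;

    // No-argument constructor so an empty instance can be created
    // first and then filled in by readFields.
    PointWritable() {}

    PointWritable(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeDouble(x);
        out.writeDouble(y);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        x = in.readDouble();
        y = in.readDouble();
    }
}

class PointWritableDemo {
    // Round-trips a point through a byte array, which stands in for a file.
    static PointWritable roundTrip(PointWritable p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            p.write(new DataOutputStream(bytes));
            PointWritable copy = new PointWritable();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PointWritable copy = roundTrip(new PointWritable(1.5, -2.0));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

With a type like this as the key or value class, SequenceFileInputFormat can hand the deserialized objects straight to the mapper, so no text parsing is needed in the map function.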

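
On question 2, the work of a custom OutputFormat is mostly done by the record writer it returns from getRecordWriter: that object decides how each key/value pair is laid out. Here is a rough sketch of just that part, assuming a made-up "key|value" line format. PipeRecordWriter and PipeRecordWriterDemo are hypothetical names, and the class is plain Java rather than a real Hadoop RecordWriter so that it runs without the Hadoop jar. For comparison, the default TextOutputFormat writes the key, a tab, and the value on each line.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical writer. In Hadoop this logic would sit in a RecordWriter
// returned by a custom OutputFormat's getRecordWriter method.
class PipeRecordWriter<K, V> {
    private final Writer out;

    PipeRecordWriter(Writer out) {
        this.out = out;
    }

    // Emit one record per line in the form key|value.
    public void write(K key, V value) throws IOException {
        out.write(String.valueOf(key));
        out.write('|');
        out.write(String.valueOf(value));
        out.write('\n');
    }

    // Release the underlying stream once all records are written.
    public void close() throws IOException {
        out.close();
    }
}

class PipeRecordWriterDemo {
    // Formats {key, value} pairs into a string, standing in for an output file.
    static String format(String[][] records) {
        StringWriter buffer = new StringWriter();
        try {
            PipeRecordWriter<String, String> writer = new PipeRecordWriter<>(buffer);
            for (String[] record : records) {
                writer.write(record[0], record[1]);
            }
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(new String[][] {{"apple", "3"}, {"pear", "7"}}));
    }
}
```

A real implementation would open the stream on the job's output path inside getRecordWriter and pass it to the writer; the formatting logic itself would stay the same.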