hadoop-common-user mailing list archives

From Dennis Kubes <nutch-...@dragonflymc.com>
Subject Re: SequenceFile (Text,Text) becomes plain text
Date Fri, 02 Feb 2007 22:18:25 GMT
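You need to set the input format of the second job.  It defaults to 
TextInputFormat, which is why the second job receives (LongWritable, Text) 
records.  Use lines like the ones below in the second job.

secondjob.setInputFormat(SequenceFileInputFormat.class);
secondjob.setInputKeyClass(Text.class);
secondjob.setInputValueClass(Text.class);

Since a cat of the file shows tab-separated lines, the first job is also 
writing with the default TextOutputFormat, so set its output format as 
well ("firstjob" stands for whatever your first job's JobConf is called).

firstjob.setOutputFormat(SequenceFileOutputFormat.class);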
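
To make the symptom from the original question concrete, here is a minimal
plain-Java sketch (no Hadoop classes involved) that mimics how a
line-oriented text reader presents a "key \t value" file: each record's key
is the byte offset at which the line starts, and the value is the whole
line.  The names TextRecordDemo, readAsTextRecords and splitKeyValue are
made up for this illustration and are not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TextRecordDemo {
    /** One (offset, line) record, the shape a line-oriented reader hands to a mapper. */
    static final class Record {
        final long offset;
        final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Splits file contents into records keyed by the byte offset of each line. */
    static List<Record> readAsTextRecords(String fileContents) {
        List<Record> records = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            records.add(new Record(offset, line));
            // Advance past the line's bytes plus the newline that terminated it.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    /** Recovers the original pair by splitting at the first tab (illustrative choice). */
    static String[] splitKeyValue(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        String file = "apple\t1\nbanana\t2\n";
        for (Record r : readAsTextRecords(file)) {
            String[] kv = splitKeyValue(r.line);
            System.out.println(r.offset + " -> key=" + kv[0] + ", value=" + kv[1]);
        }
    }
}
```

Splitting each line at the tab inside the mapper would work as a stopgap,
but keeping both jobs on sequence files avoids the parsing step and
preserves the (Text, Text) types end to end.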

Dennis Kubes

Alejandro Abdelnur wrote:
> I may be missing something silly here,
> 
> I have a MR that generates an output type (Text,Text)
> 
> When another MR consumes that output, it is read as a plain text file, so the
> input is (LongWritable, Text), with the long key being the byte offset of the
> line and the text value being the key and value separated by a tab. My second
> MR blows up because it was expecting (Text,Text), and the key is wrong as well.
> 
> Doing a cat of the file I see it become a flat file with lines having "key
> \t value".
> 
> How can I force the output of the first MR to remain a sequence file of
> (Text, Text)?
> 
> Thxs.
> 
> A
> 
