hadoop-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Generic output key class
Date Mon, 11 Feb 2013 07:21:27 GMT
If I'm running only one MapReduce job, then IntWritable output is fine. But
when I run several jobs together and some of them emit Text output, I don't
want to duplicate MapReduce jobs for each output type. I'm trying to find a
more generic solution...
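
For reference, the tagged-key idea from the quoted reply below can be sketched roughly as follows. This is only an illustration under stated assumptions: to stay runnable without Hadoop on the classpath it uses java.io alone and holds a String and an int, and the class name TextOrIntKey and the "ints sort before text" ordering are made up for the example. In a real job the class would presumably implement org.apache.hadoop.io.WritableComparable (output keys are sorted, so plain Writable is not enough) and delegate to Text and IntWritable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a TextOrIntWritable key; not a Hadoop class.
class TextOrIntKey implements Comparable<TextOrIntKey> {
    private boolean isText;
    private String text = "";
    private int integer;

    public void set(String value) { isText = true; text = value; integer = 0; }
    public void set(int value) { isText = false; text = ""; integer = value; }

    public boolean isText() { return isText; }
    public String getText() { return text; }
    public int getInt() { return integer; }

    // Same shape as Writable.write: a tag first, then only the active field.
    public void write(DataOutput out) throws IOException {
        out.writeBoolean(isText);
        if (isText) {
            out.writeUTF(text);
        } else {
            out.writeInt(integer);
        }
    }

    // Same shape as Writable.readFields: read the tag, then the matching field.
    public void readFields(DataInput in) throws IOException {
        isText = in.readBoolean();
        if (isText) {
            text = in.readUTF();
            integer = 0;
        } else {
            integer = in.readInt();
            text = "";
        }
    }

    // Assumed ordering: all int keys before all text keys, natural order within each kind.
    @Override
    public int compareTo(TextOrIntKey other) {
        if (isText != other.isText) {
            return isText ? 1 : -1;
        }
        return isText ? text.compareTo(other.text) : Integer.compare(integer, other.integer);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TextOrIntKey && compareTo((TextOrIntKey) o) == 0;
    }

    // Keys are typically partitioned by hash code, so it must agree with equals.
    @Override
    public int hashCode() {
        return isText ? text.hashCode() : integer;
    }

    // Serializes a key to bytes and reads it back, as a quick self-check.
    static TextOrIntKey roundTrip(TextOrIntKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            TextOrIntKey copy = new TextOrIntKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, every job could declare the same output key class and each task would call set(String) or set(int) as needed. Whether mixing both kinds of key in one sorted output is acceptable depends on the downstream consumers.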

On Mon, Feb 11, 2013 at 3:18 AM, Michael Segel <michael_segel@hotmail.com> wrote:

> Why not just write out the int as a numeric string?
>
> On Feb 10, 2013, at 1:07 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>
> Hi Amit,
>
> One way to accomplish this would be to create a custom writable
> implementation, TextOrIntWritable, that has fields for both.  It could look
> something like:
>
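> import java.io.DataOutput;
> import java.io.IOException;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.io.Writable;
>
> class TextOrIntWritable implements Writable {
>   private boolean isText;
>   private Text text;
>   private IntWritable integer;
>
>   // Write a tag first so readFields knows which field follows.
>   @Override
>   public void write(DataOutput out) throws IOException {
>     out.writeBoolean(isText);
>     if (isText) {
>       text.write(out);
>     } else {
>       integer.write(out);
>     }
>   }
>
>   [... readFields method that works in a similar way]
> }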
>
> -Sandy
>
> On Sun, Feb 10, 2013 at 4:00 AM, Amit Sela <amits@infolinks.com> wrote:
>
>> Hi all,
>>
>> Has anyone ever used some kind of "generic output key" for a MapReduce
>> job?
>>
>> I have a job running multiple tasks and I want them to be able to use
>> both Text and IntWritable as output key classes.
>>
>> Any suggestions?
>>
>> Thanks,
>>
>> Amit.
>>
>
>
> Michael Segel  <msegel@segel.com> | (m) 312.755.9623
>
> Segel and Associates
>
>
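
On the suggestion quoted above of writing the int as a numeric string: one thing to keep in mind is that string keys sort as text rather than as numbers, so "10" comes before "9". A minimal sketch of the difference, and of zero-padding as one possible workaround for non-negative ints, is below. The class and method names are hypothetical, and plain Java string sorting stands in for the job's key ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper illustrating int-as-string keys; not part of Hadoop.
class NumericStringKeys {
    // Plain decimal form: compact, but sorts lexicographically ("10" < "9").
    static String plain(int value) {
        return Integer.toString(value);
    }

    // Zero-padded to 10 digits so text order matches numeric order.
    // Assumption: values are non-negative (a leading '-' would break the ordering).
    static String padded(int value) {
        return String.format(Locale.ROOT, "%010d", value);
    }

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sortedCopy(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }
}
```

If the numeric order of the keys does not matter to the consumers of the output, the plain form is probably enough.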
