hadoop-mapreduce-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: Reducer that outputs no key
Date Fri, 24 May 2013 16:20:55 GMT
You can ignore this for now.  I was able to get the file merging to work
under Hadoop Streaming by using the following two settings:

-mapper "cut -f2-"
-Dmapred.reduce.tasks=0
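For anyone finding this thread later, the trick relies on plain `cut`
semantics: `cut -f2-` drops the first tab-separated field of every input
line (the key) and keeps the rest, and with zero reduce tasks the map
output is written out directly.  A quick local sketch of the mapper's
behavior (the sample lines are made up):

```shell
# Simulate tab-separated key/value lines and strip the leading key field.
# Sample data is hypothetical; cut -f2- keeps field 2 onward.
printf 'k1\tapple\tred\nk2\tbanana\tyellow\n' | cut -f2-
# -> apple	red
#    banana	yellow
```

Since the key field is gone before the map output is written, the output
files contain only the values.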


On Fri, May 24, 2013 at 12:55 AM, Something Something <
mailinglists19@gmail.com> wrote:

> Hello,
>
> I'm trying to use Hadoop Streaming to create output that contains no key,
> just the value.
>
> Here's what I am trying:
>
> 1)  I created an IdentifierResolver subclass as follows:
>
> import org.apache.hadoop.io.NullWritable;
> import org.apache.hadoop.streaming.io.IdentifierResolver;
>
> public class MyIdentifierResolver extends IdentifierResolver {
>
>     @Override
>     public void resolve(String identifier) {
>         System.out.println("Entered resolve with identifier: " +
> identifier);
>         super.resolve(identifier);
>         if (identifier.equals("NullWritable")) {
>             System.out.println("Setting output key class to NullWritable");
>             setOutputKeyClass(NullWritable.class);
>         }
>     }
> }
>
>
> 2)  Set the properties as follows:
>
> -Dstream.io.identifier.resolver.class=com.my.package.MyIdentifierResolver \
> -Dstream.map.output=NullWritable \
> -Dstream.reduce.output=NullWritable
>
>
> This should work, right?  But it's still writing the 'key' in the output.
> Is there a better way to do this in Hadoop?
>
> Note:  Basically, we are trying to merge a large number of files (over
> 2,000) into a smaller number of files (e.g. 500).  The files are too big,
> so 'getmerge' does not work because we run into space issues.
>
> Please help.  Thanks.
>
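For reference, the resolver-based attempt quoted above would be wired in
roughly like this.  The jar location and input/output paths below are
placeholders, and it assumes the custom resolver class has been made
available on the task classpath; note that `-D` generic options must come
before the streaming-specific options on the command line:

```shell
# Sketch of a full streaming invocation (paths are hypothetical).
# -D generic options must precede -input/-output/-mapper/-reducer.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
    -Dstream.io.identifier.resolver.class=com.my.package.MyIdentifierResolver \
    -Dstream.map.output=NullWritable \
    -Dstream.reduce.output=NullWritable \
    -input /path/to/input \
    -output /path/to/output \
    -mapper /bin/cat \
    -reducer /bin/cat
```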
