hbase-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: List of Puts in mapreduce java job
Date Tue, 19 May 2015 13:54:34 GMT
TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class, job);

won't work, as you can see from its implementation:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase-server/0.99.2/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java#TableMapReduceUtil.initTableReducerJob%28java.lang.String%2Cjava.lang.Class%2Corg.apache.hadoop.hbase.mapreduce.Job%2Cjava.lang.Class%2Cjava.lang.String%2Cjava.lang.String%2Cjava.lang.String%29

It expects a TableReducer, which, as we know, emits a Mutation as its output value.
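
A minimal sketch of why the failure probably shows up as a ClassCastException at runtime rather than only at compile time: the framework calls the output writer through an erased generic type, so the cast to Mutation happens inside the writer. All classes below are hypothetical plain-Java stand-ins, not the real HBase or Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureCastDemo {
    // Hypothetical stand-ins for the HBase types of the same name.
    static class Mutation {}
    static class Put extends Mutation {}

    // Stand-in for a generic record writer, as the framework sees it.
    abstract static class RecordWriter<K, V> {
        abstract void write(K key, V value);
    }

    // Stand-in for a writer whose value type is fixed to Mutation.
    static class MutationWriter<K> extends RecordWriter<K, Mutation> {
        int written = 0;

        @Override
        void write(K key, Mutation value) {
            written++;
        }
    }

    // Calls the writer through the raw type, the way erased framework code
    // would. Returns true if the value was accepted, false if the cast failed.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean tryWrite(RecordWriter writer, Object value) {
        try {
            writer.write(null, value);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        MutationWriter<String> writer = new MutationWriter<>();
        List<Put> puts = new ArrayList<>();
        puts.add(new Put());

        System.out.println(tryWrite(writer, puts));        // false: a list is not a Mutation
        System.out.println(tryWrite(writer, puts.get(0))); // true: a single Put is a Mutation
    }
}
```

Under this assumption, swapping the reducer class alone does not help: as long as the output format's writer is typed on Mutation, a List<Put> value is rejected.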

To handle that, you need to define your own initTableReducerJob. I have
done something like this by copying the code from the original method and
just changing the expected 'TableReducer' to 'Reducer'. Something along
these lines (the main change is in bold):

public static void initTableReducerJobBatchPuts(String[] tables,
                *Class<? extends Reducer> reducer*, Job job, Class partitioner,
                String quorumAddress, String serverClass, String serverImpl,
                boolean addDependencyJars, Durability d) throws IOException
        {

            Configuration conf = job.getConfiguration();
            HBaseConfiguration.merge(conf, HBaseConfiguration.create(conf));

            job.setOutputFormatClass(RPDMultiTableOutputFormatBatchPuts.class);

            if (reducer != null)
            {
                job.setReducerClass(reducer);
            }
...
...
..
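
The custom output format named above is not shown in this message. As a rough sketch of the idea only, a record writer whose value type is List<Put> could hand the whole batch to the table in one call. Every name below (BatchPutWriter, PutSink, RecordingSink) is hypothetical, and Put is a plain-Java stand-in rather than the real HBase class.

```java
import java.util.ArrayList;
import java.util.List;

class BatchPutWriterSketch {
    // Hypothetical stand-in for org.apache.hadoop.hbase.client.Put.
    static class Put {
        final String row;

        Put(String row) {
            this.row = row;
        }
    }

    // Hypothetical stand-in for a table handle that accepts a batch of Puts.
    interface PutSink {
        void put(List<Put> puts);
    }

    // Writer whose value type is List<Put>, matching a reducer declared as
    // Reducer<KEYIN, VALUEIN, KEYOUT, List<Put>>.
    static class BatchPutWriter<K> {
        private final PutSink sink;

        BatchPutWriter(PutSink sink) {
            this.sink = sink;
        }

        void write(K key, List<Put> puts) {
            if (puts == null || puts.isEmpty()) {
                return; // nothing to send
            }
            sink.put(puts); // forward the whole batch in one call
        }
    }

    // In-memory sink, used only to make the sketch runnable.
    static class RecordingSink implements PutSink {
        final List<String> rows = new ArrayList<>();

        @Override
        public void put(List<Put> puts) {
            for (Put p : puts) {
                rows.add(p.row);
            }
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        BatchPutWriter<String> writer = new BatchPutWriter<>(sink);

        List<Put> puts = new ArrayList<>();
        puts.add(new Put("row1"));
        puts.add(new Put("row2"));
        writer.write(null, puts);

        System.out.println(sink.rows); // [row1, row2]
    }
}
```

The point of the design is that the reducer's declared output value type and the writer's value type agree, so no cast to Mutation is ever attempted.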

Regards,
Shahab



On Tue, May 19, 2015 at 9:41 AM, Silvio Di gregorio <
silvio.digregorio@gmail.com> wrote:

> I tried it this way:
>
>      public static class Reduce extends Reducer<IntWritable, Text,
> ImmutableBytesWritable, List<Put>> {
>
> and in the main
>
>     TableMapReduceUtil.initTableReducerJob(inputTableName, null, job);
> instead of
>      TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class,
> job);
>
> It is now a Reducer class, not a TableReducer, but I get the same error:
>
>
> Error: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
> org.apache.hadoop.hbase.client.Mutation
>
> 2015-05-19 15:14 GMT+02:00 Shahab Yunus <shahab.yunus@gmail.com>:
>
> > The error is highlighting the issue.
> >
> > You can't output a List of Puts like this. Your reducer output is a
> > Mutation and NOT a list of Mutations.
> >
> > I have handled this scenario by defining my own base abstract class:
> >
> >
> > *public* *abstract* *class* TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT>
> > *extends* Reducer<KEYIN, VALUEIN, KEYOUT, List<Put>> {
> > ...
> > I then implement my reducer by extending this class. You can do
> > something similar, perhaps?
> >
> > Regards,
> > Shahab
> >
> > On Tue, May 19, 2015 at 9:05 AM, Silvio Di gregorio <
> > silvio.digregorio@gmail.com> wrote:
> >
> > > Hi
> > > I'm trying to emit a list of Puts in the reduce phase:
> > >
> > > *context.write(null , puts);*
> > >
> > > puts is
> > >
> > > *List<Put> puts=new ArrayList<Put>();*
> > >
> > > and the Reduce signature is:
> > >
> > > *public static class Reduce extends TableReducer<IntWritable, Text,
> > > ImmutableBytesWritable>{*
> > >
> > > this is the error
> > >
> > > *The method write(ImmutableBytesWritable, Mutation) in the type
> > > TaskInputOutputContext<IntWritable,Text,ImmutableBytesWritable,Mutation>
> > > is not applicable for the arguments (null, List<Put>)*
> > >
> > > thanks
> > > silvio
> > >
> >
>
