hbase-user mailing list archives

From Silvio Di gregorio <silvio.digrego...@gmail.com>
Subject Re: List of Puts in mapreduce java job
Date Tue, 19 May 2015 19:12:13 GMT
Thanks to Shahab too for the trick.
On 19 May 2015 at 15:54, "Shahab Yunus" <shahab.yunus@gmail.com> wrote:

> TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class,
> job);
>
> won't work because, if you look at its implementation:
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase-server/0.99.2/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java#TableMapReduceUtil.initTableReducerJob%28java.lang.String%2Cjava.lang.Class%2Corg.apache.hadoop.hbase.mapreduce.Job%2Cjava.lang.Class%2Cjava.lang.String%2Cjava.lang.String%2Cjava.lang.String%29
>
> It expects a TableReducer, which, as we know, takes Mutation as its output.
>
> To handle that, you need to define your own initTableReducerJob. I have
> done something like this by copying the code from the original method and
> just changing the expected 'TableReducer' to 'Reducer'. Something along
> these lines (the main change is in bold):
>
> public static void initTableReducerJobBatchPuts(String[] tables,
>         *Class<? extends Reducer> reducer*, Job job, Class partitioner,
>         String quorumAddress, String serverClass, String serverImpl,
>         boolean addDependencyJars, Durability d) throws IOException
> {
>     Configuration conf = job.getConfiguration();
>     HBaseConfiguration.merge(conf, HBaseConfiguration.create(conf));
>
>     job.setOutputFormatClass(RPDMultiTableOutputFormatBatchPuts.class);
>
>     if (reducer != null)
>     {
>         job.setReducerClass(reducer);
>     }
>     ...
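>
> A rough sketch of how the driver could then call this helper (the table
> name, the enclosing class name and the null defaults below are just
> placeholders):
>
>     // assuming the helper above lives in a class called MyTableMapReduceUtil
>     MyTableMapReduceUtil.initTableReducerJobBatchPuts(
>             new String[] { "output_table" },   // target table(s)
>             Reduce.class,                      // a plain Reducer, not a TableReducer
>             job,
>             null,                              // partitioner: use the default
>             null, null, null,                  // quorumAddress, serverClass, serverImpl
>             true,                              // ship dependency jars with the job
>             Durability.USE_DEFAULT);           // org.apache.hadoop.hbase.client.Durability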
>
> Regards,
> Shahab
>
>
>
> On Tue, May 19, 2015 at 9:41 AM, Silvio Di gregorio <
> silvio.digregorio@gmail.com> wrote:
>
> > I tried it this way:
> >
> >      public static class Reduce extends Reducer<IntWritable, Text,
> > ImmutableBytesWritable, List<Put>> {
> >
> > and in the main method:
> >
> >     TableMapReduceUtil.initTableReducerJob(inputTableName, null, job);
> > instead of
> >      TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class,
> > job);
> >
> > In fact, it is now a Reducer class, not a TableReducer, but I get the
> > same error:
> >
> >
> > Error: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
> > org.apache.hadoop.hbase.client.Mutation
> >
> > 2015-05-19 15:14 GMT+02:00 Shahab Yunus <shahab.yunus@gmail.com>:
> >
> > > The error is highlighting the issue.
> > >
> > > You can't output a List of Puts like this. Your reducer's output type
> > > is Mutation, NOT a list of Mutations.
> > >
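> > > With the stock TableReducer, each call to context.write takes a single
> > > Put, so a minimal sketch of the direct fix is simply to loop:
> > >
> > >     for (Put put : puts) {
> > >         context.write(new ImmutableBytesWritable(put.getRow()), put);
> > >     }
> > >
> > > But if you want to emit the whole List<Put> as one value, you need your
> > > own plumbing.
> > >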
> > > I have handled this scenario by defining my own base abstract class:
> > >
> > >
> > > *public* *abstract* *class* TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT>
> > >         *extends* Reducer<KEYIN, VALUEIN, KEYOUT, List<Put>> {
> > > ...
> > > And then implementing my reducer by extending this class. You could do
> > > something similar, perhaps?
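> > >
> > > For illustration, a concrete reducer on top of that base class could
> > > look roughly like this (the column family, qualifier, row key and table
> > > name are made up for the sketch):
> > >
> > >     public static class MyBatchPutReducer
> > >             extends TableReducerBatchPuts<IntWritable, Text, ImmutableBytesWritable> {
> > >
> > >         @Override
> > >         protected void reduce(IntWritable key, Iterable<Text> values, Context context)
> > >                 throws IOException, InterruptedException {
> > >             List<Put> puts = new ArrayList<Put>();
> > >             for (Text value : values) {
> > >                 Put put = new Put(Bytes.toBytes(value.toString()));   // row key from value
> > >                 put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
> > >                         Bytes.toBytes(key.get()));                    // cf:q <- key
> > >                 puts.add(put);
> > >             }
> > >             // with a MultiTableOutputFormat-style writer the key picks the target table
> > >             context.write(new ImmutableBytesWritable(Bytes.toBytes("output_table")), puts);
> > >         }
> > >     }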
> > >
> > > Regards,
> > > Shahab
> > >
> > > On Tue, May 19, 2015 at 9:05 AM, Silvio Di gregorio <
> > > silvio.digregorio@gmail.com> wrote:
> > >
> > > > Hi
> > > > I'm trying to emit, in the reduce phase, a list of Puts:
> > > >
> > > > *context.write(null, puts);*
> > > >
> > > > puts is
> > > >
> > > > *List<Put> puts=new ArrayList<Put>();*
> > > >
> > > > and the Reduce signature is:
> > > >
> > > > *public static class Reduce extends TableReducer<IntWritable, Text,
> > > > ImmutableBytesWritable>{*
> > > >
> > > > this is the error:
> > > >
> > > > *The method write(ImmutableBytesWritable, Mutation) in the type
> > > > TaskInputOutputContext<IntWritable,Text,ImmutableBytesWritable,Mutation>
> > > > is not applicable for the arguments (null, List<Put>)*
> > > >
> > > > thanks
> > > > silvio
> > > >
> > >
> >
>
