hbase-user mailing list archives

From Solomon Duskis <sdus...@gmail.com>
Subject Re: List of Puts in mapreduce java job
Date Tue, 19 May 2015 13:53:02 GMT
It looks like you're using hbase 1.0 based on the fact that you're getting
a cast to Mutation rather than to Put; is that right?

There is no advantage to doing a write(putList) over a write(singlePut).
Under the covers, context.write() handles a single mutation, but doesn't
actually send that mutation to the server right away.  It saves
that single put into a buffer and lets the map/reduce continue.
Once that buffer is "full" (which usually means it has reached 2MB in size),
the buffer is sent asynchronously to the server.

You have to do the following:

for(Put put: puts){
  context.write(null, put);
}

You won't be able to get the default implementation of HBase map/reduce to
take a list.  For long-running map/reduce jobs, the HBase map/reduce
single-put implementation is at least as efficient as put(list).
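
To make the buffering argument concrete, here is a minimal sketch in plain
Java with no HBase dependency.  SketchWriteBuffer and all of its methods are
hypothetical names made up for illustration, not HBase classes, and the real
client is more involved (for one thing, its flush is asynchronous).  The only
figure taken from the discussion above is the 2MB threshold.  It shows that
writing puts one at a time and writing them as a list trigger the same number
of flushes:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a client-side write buffer; NOT an HBase class. */
class SketchWriteBuffer {
    private final long flushThresholdBytes;
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int flushCount = 0;

    SketchWriteBuffer(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Plays the role of context.write(null, put): buffer one mutation, flush only when full. */
    void write(byte[] mutation) {
        pending.add(mutation);
        pendingBytes += mutation.length;
        if (pendingBytes >= flushThresholdBytes) {
            flush();
        }
    }

    /** A list-based write is just a loop over single writes. */
    void writeAll(List<byte[]> mutations) {
        for (byte[] mutation : mutations) {
            write(mutation);
        }
    }

    /** Pretend to send the buffered mutations to the server (synchronous here, for simplicity). */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        pending.clear();
        pendingBytes = 0;
        flushCount++;
    }

    int getFlushCount() { return flushCount; }

    int getPendingCount() { return pending.size(); }

    public static void main(String[] args) {
        long twoMb = 2L * 1024 * 1024; // assumed threshold, per the "2MB" mentioned above
        List<byte[]> puts = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            puts.add(new byte[1024]); // each fake put is 1 KB
        }

        SketchWriteBuffer oneAtATime = new SketchWriteBuffer(twoMb);
        for (byte[] put : puts) {
            oneAtATime.write(put);
        }

        SketchWriteBuffer asList = new SketchWriteBuffer(twoMb);
        asList.writeAll(puts);

        System.out.println("flushes, one at a time: " + oneAtATime.getFlushCount());
        System.out.println("flushes, as a list:     " + asList.getFlushCount());
    }
}
```

In this sketch the number of round trips depends only on how many bytes have
accumulated, not on whether the caller handed over one put or a list.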

I hope this helps.

On Tue, May 19, 2015 at 9:41 AM, Silvio Di gregorio <
silvio.digregorio@gmail.com> wrote:

> I tried it this way
>
>      public static class Reduce extends Reducer<IntWritable, Text,
> ImmutableBytesWritable, List<Put>> {
>
> and in the main
>
>     TableMapReduceUtil.initTableReducerJob(inputTableName, null, job);
> instead of
>      TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class,
> job);
>
> Now it is a plain Reducer class, not a TableReducer, but I get the same error
>
>
> Error: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
> org.apache.hadoop.hbase.client.Mutation
>
> 2015-05-19 15:14 GMT+02:00 Shahab Yunus <shahab.yunus@gmail.com>:
>
> > The error is highlighting the issue.
> >
> > You can't output List of Puts like this. Your reducer output is Mutation
> > and NOT a list of Mutation.
> >
> > I have handled this scenario by defining my own base abstract class:
> >
> >
> > public abstract class TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT>
> > extends Reducer<KEYIN, VALUEIN, KEYOUT, List<Put>> {
> > ...
> > And then using this to implement my reducer by extending it. You can do
> > something similar, perhaps?
> >
> > Regards,
> > Shahab
> >
> > On Tue, May 19, 2015 at 9:05 AM, Silvio Di gregorio <
> > silvio.digregorio@gmail.com> wrote:
> >
> > > Hi
> > > I'm trying to emit, on reduce phase, a list of puts
> > >
> > > context.write(null, puts);
> > >
> > > puts is
> > >
> > > List<Put> puts = new ArrayList<Put>();
> > >
> > > and the Reduce signature is:
> > >
> > > public static class Reduce extends TableReducer<IntWritable, Text,
> > > ImmutableBytesWritable> {
> > >
> > > this is the error
> > >
> > > The method write(ImmutableBytesWritable, Mutation) in the type
> > > TaskInputOutputContext<IntWritable,Text,ImmutableBytesWritable,Mutation>
> > > is not applicable for the arguments (null, List<Put>)
> > >
> > > thanks
> > > silvio
> > >
> >
>
