hbase-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: BigDecimalColumnInterpreter
Date Wed, 05 Sep 2012 21:04:11 GMT
Hi Julian,

I have been running the same class on my distributed cluster for
aggregation. It has been working fine. The only difference is that I use
the methods provided in the
com.intuit.ihub.hbase.poc.aggregation.client.AggregationClient class.
IMHO, you don't need to define an Endpoint to use the
BigDecimalColumnInterpreter.

You need to use the methods of AggregationClient: sum(Bytes.toBytes(tableName),
ci, scan), avg(final byte[] tableName, final ColumnInterpreter<R, S> ci,
Scan scan), etc.

To these methods you just need to pass the BigDecimalColumnInterpreter, the
Scan object, and the byte array of the table name. It should work. Let me
know if it doesn't work this way.

Thanks,
Anil Gupta
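Anil's suggestion can be sketched as follows. This is a hedged, illustrative example against the 0.92-era HBase client API; the table name "metrics" and column family "data" are assumptions, and it presumes the BigDecimalColumnInterpreter class from the HBASE-6669 patch (import path as in the patch) is on both the client and the region server classpath:

```java
import java.math.BigDecimal;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.BigDecimalColumnInterpreter;
import org.apache.hadoop.hbase.coprocessor.ColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class BigDecimalAggregationExample {
  public static void main(String[] args) throws Throwable {
    Configuration conf = HBaseConfiguration.create();
    AggregationClient aggregationClient = new AggregationClient(conf);

    // Scan the column family holding the BigDecimal-encoded values.
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("data"));

    ColumnInterpreter<BigDecimal, BigDecimal> ci = new BigDecimalColumnInterpreter();

    // sum/avg/max/min all share the (tableName, interpreter, scan) signature,
    // so no custom Endpoint is needed on top of the stock AggregateImplementation.
    BigDecimal sum = aggregationClient.sum(Bytes.toBytes("metrics"), ci, scan);
    BigDecimal max = aggregationClient.max(Bytes.toBytes("metrics"), ci, scan);
    System.out.println("sum=" + sum + " max=" + max);
  }
}
```

Running this requires a live cluster with the interpreter jar deployed, so treat it as a sketch of the call shape rather than a standalone program.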

On Wed, Sep 5, 2012 at 1:30 PM, Julian Wissmann <julian.wissmann@sdace.de> wrote:

> Thank you!
> So this looks like the missing link here.
> I'll see if I can get it working, tomorrow morning.
>
> Cheers
>
> 2012/9/5 Ted Yu <yuzhihong@gmail.com>
>
> > I added one review comment on HBASE-6669
> > <https://issues.apache.org/jira/browse/HBASE-6669>.
> >
> > Thanks Julian for reminding me.
> >
> > On Wed, Sep 5, 2012 at 12:49 PM, Julian Wissmann
> > <julian.wissmann@sdace.de> wrote:
> >
> > > I get supplied with doubles from sensors, but in the end I lose too much
> > > precision if I do my aggregations on double, otherwise I'd go for it.
> > > I use 0.92.1, from Cloudera CDH4.
> > > I've done some initial testing with LongColumnInterpreter on a dataset
> > > that I've generated, to do some testing and get accustomed to stuff, but
> > > that worked like a charm after some initial stupidity on my side.
> > > So now I'm trying to do some testing with the real data, which comes in
> > > as double and gets parsed to BigDecimal before writing.
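Julian's precision concern is easy to reproduce in plain Java: accumulating in double drifts, while BigDecimal arithmetic stays exact. A minimal, self-contained illustration:

```java
import java.math.BigDecimal;

public class PrecisionDemo {
  // Sum 0.1 a thousand times in double: binary rounding error accumulates.
  static String doubleSum() {
    double d = 0.0;
    for (int i = 0; i < 1000; i++) {
      d += 0.1;
    }
    return String.valueOf(d); // not exactly "100.0"
  }

  // The same sum in BigDecimal is exact, since "0.1" is held as a decimal.
  static String bigDecimalSum() {
    BigDecimal bd = BigDecimal.ZERO;
    BigDecimal tenth = new BigDecimal("0.1");
    for (int i = 0; i < 1000; i++) {
      bd = bd.add(tenth);
    }
    return bd.toString(); // exactly "100.0"
  }

  public static void main(String[] args) {
    System.out.println("double:     " + doubleSum());
    System.out.println("BigDecimal: " + bigDecimalSum());
  }
}
```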
> > >
> > > 2012/9/5 Ted Yu <yuzhihong@gmail.com>
> > >
> > > > And your HBase version is ?
> > > >
> > > > Since you use Double.parseDouble(), looks like it would be more
> > efficient
> > > > to develop DoubleColumnInterpreter.
> > > >
> > > > On Wed, Sep 5, 2012 at 12:07 PM, Julian Wissmann
> > > > <julian.wissmann@sdace.de> wrote:
> > > >
> > > > > Hi,
> > > > > the schema looks like this:
> > > > > RowKey: id,timerange_timestamp,offset (String)
> > > > > Qualifier: Offset (long)
> > > > > Timestamp: timestamp (long)
> > > > > Value:number (BigDecimal)
> > > > >
> > > > > Or as code when I read data from csv: byte[] value =
> > > > > Bytes.toBytes(BigDecimal.valueOf(Double.parseDouble(cData[2])));
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Julian
> > > > >
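The schema and write path Julian describes can be sketched end to end. A hedged illustration against the 0.92-era client API; the table name, family name, and sample values are assumptions, and only the Bytes.toBytes(BigDecimal.valueOf(...)) line is taken from his snippet:

```java
import java.math.BigDecimal;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SensorWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics");

    // Row key: id,timerange_timestamp,offset (as a String), per the schema above.
    String rowKey = "sensor42,1346882400,0";
    long offset = 0L;                          // qualifier: offset (long)
    long timestamp = System.currentTimeMillis(); // cell timestamp (long)

    // Value: the CSV field parsed to BigDecimal, exactly as in Julian's snippet.
    String csvField = "23.715";
    byte[] value = Bytes.toBytes(BigDecimal.valueOf(Double.parseDouble(csvField)));

    Put put = new Put(Bytes.toBytes(rowKey));
    put.add(Bytes.toBytes("data"), Bytes.toBytes(offset), timestamp, value);
    table.put(put);
    table.close();
  }
}
```

As with any HBase client code, this only runs against a live cluster; it is meant to make the schema concrete, not to be authoritative.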
> > > > > 2012/9/5 Ted Yu <yuzhihong@gmail.com>
> > > > >
> > > > > > You haven't told us the schema of your table yet.
> > > > > > Your table should have a column whose value can be interpreted by
> > > > > > BigDecimalColumnInterpreter.
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Wed, Sep 5, 2012 at 9:17 AM, Julian Wissmann
> > > > > > <julian.wissmann@sdace.de> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am currently experimenting with the BigDecimalColumnInterpreter
> > > > > > > from https://issues.apache.org/jira/browse/HBASE-6669.
> > > > > > >
> > > > > > > I was thinking the best way for me to work with it would be to use
> > > > > > > the Java class and just use that as is.
> > > > > > >
> > > > > > > Imported it into my project and tried to work with it as is, by
> > > > > > > just instantiating the ColumnInterpreter as
> > > > > > > BigDecimalColumnInterpreter. Okay, it threw errors and also
> > > > > > > complained about not knowing where to find such a class.
> > > > > > >
> > > > > > > So I did some reading and found out that I'd need to have an
> > > > > > > Endpoint for it. So I imported AggregateImplementation and
> > > > > > > AggregateProtocol into my workspace, renamed them, and refactored
> > > > > > > them where necessary to take BigDecimal. Then I re-exported the
> > > > > > > jar and had another try.
> > > > > > >
> > > > > > > So when I call:
> > > > > > > ------
> > > > > > > final Scan scan = new Scan((metricID + "," + basetime_begin).getBytes(),
> > > > > > >     (metricID + "," + basetime_end).getBytes());
> > > > > > > scan.addFamily(family.getBytes());
> > > > > > > final ColumnInterpreter<BigDecimal, BigDecimal> ci =
> > > > > > >     new BigDecimalColumnInterpreter();
> > > > > > > Map<byte[], BigDecimal> results =
> > > > > > >     table.coprocessorExec(BigDecimalProtocol.class, null, null,
> > > > > > >         new Batch.Call<BigDecimalProtocol, BigDecimal>() {
> > > > > > >           public BigDecimal call(BigDecimalProtocol instance) throws IOException {
> > > > > > >             return instance.getMax(ci, scan);
> > > > > > >           }
> > > > > > >         });
> > > > > > > ------
> > > > > > > I get errors in the log again, that it can't find
> > > > > > > BigDecimalColumnInterpreter... okay, so I tried
> > > > > > > ------
> > > > > > > Scan scan = new Scan((metricID + "," + basetime_begin).getBytes(),
> > > > > > >     (metricID + "," + basetime_end).getBytes());
> > > > > > > scan.addFamily(family.getBytes());
> > > > > > > final ColumnInterpreter<BigDecimal, BigDecimal> ci =
> > > > > > >     new BigDecimalColumnInterpreter();
> > > > > > > AggregationClient ag = new AggregationClient(config);
> > > > > > > BigDecimal max = ag.max(Bytes.toBytes(tableName), ci, scan);
> > > > > > > ------
> > > > > > > I don't get errors recorded in the log anymore, but a load of Java
> > > > > > > error output:
> > > > > > > ------
> > > > > > >
> > > > > > > java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > > > > > > Wed Sep 05 18:13:43 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:44 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:45 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:46 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:49 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:51 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:55 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:59 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:07 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:23 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > >
> > > > > > > at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
> > > > > > > at java.util.concurrent.FutureTask.get(FutureTask.java:111)
> > > > > > > at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1434)
> > > > > > > at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1263)
> > > > > > > at org.apache.hadoop.hbase.client.coprocessor.AggregationClient.sum(AggregationClient.java:259)
> > > > > > > at helpers.HbaseReaderBigDecimal.getWeeksumSCAN(HbaseReaderBigDecimal.java:360)
> > > > > > > at helpers.HbaseReaderBigDecimal.main(HbaseReaderBigDecimal.java:81)
> > > > > > > Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > > > > > > Wed Sep 05 18:13:43 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:44 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:45 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:46 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:49 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:51 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:55 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:59 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:07 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:23 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > >
> > > > > > > at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
> > > > > > > at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
> > > > > > > at $Proxy7.getSum(Unknown Source)
> > > > > > > at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$4.call(AggregationClient.java:263)
> > > > > > > at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$4.call(AggregationClient.java:260)
> > > > > > > at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1422)
> > > > > > > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> > > > > > > at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> > > > > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > > > > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > > > > > > at java.lang.Thread.run(Thread.java:679)
> > > > > > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > > > > > > Wed Sep 05 18:13:43 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:44 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:45 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:46 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:49 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:51 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:55 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:13:59 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:07 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > > Wed Sep 05 18:14:23 CEST 2012, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@50502819, java.io.IOException: IPC server unable to read call parameters: Error in readFields
> > > > > > >
> > > > > > > at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
> > > > > > > at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
> > > > > > > at $Proxy7.getSum(Unknown Source)
> > > > > > > at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$4.call(AggregationClient.java:263)
> > > > > > > at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$4.call(AggregationClient.java:260)
> > > > > > > at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1422)
> > > > > > > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> > > > > > > at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> > > > > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > > > > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > > > > > > at java.lang.Thread.run(Thread.java:679)
> > > > > > >
> > > > > > > ------
> > > > > > >
> > > > > > > I'm not really sure about what I'm doing wrong. Does anyone have a
> > > > > > > hint towards the right direction?
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
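A hedged reading of the failure above: "IPC server unable to read call parameters: Error in readFields" usually means the region servers cannot deserialize a class named in the RPC, i.e. the custom jar was never deployed server-side. In 0.92-era HBase, an aggregation endpoint is loaded on every region server via hbase-site.xml. An illustrative fragment, showing the stock endpoint class; a renamed BigDecimal variant would go here instead:

```xml
<!-- hbase-site.xml on each region server (illustrative fragment).
     The jar containing the endpoint class must also be on the region
     server classpath; restart the region servers after changing this. -->
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
</property>
```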



-- 
Thanks & Regards,
Anil Gupta
