hbase-user mailing list archives

From Patrick Schless <patr...@tempo-db.com>
Subject CellCounter -- Exceeded limits on number of counters
Date Wed, 11 Jul 2012 15:26:03 GMT
I am trying to find out the number of data points (cells) in a table
with "hbase org.apache.hadoop.hbase.mapreduce.CellCounter <table>
<output>". On a very small table (3 cells), it works fine. On a table
with a couple thousand cells, I get this error (4 times):

org.apache.hadoop.mapred.Counters$CountersExceededException: Error:
Exceeded limits on number of counters - Counters=120 Limit=120
        at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:316)
        at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:450)
        at org.apache.hadoop.mapred.Task$TaskReporter.getCounter(Task.java:590)
        at org.apache.hadoop.mapred.Task$TaskReporter.getCounter(Task.java:537)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.getCounter(TaskInputOutputContext.java:88)
        at org.apache.hadoop.hbase.mapreduce.CellCounter$CellCounterMapper.map(CellCounter.java:134)
        at org.apache.hadoop.hbase.mapreduce.CellCounter$CellCounterMapper.map(CellCounter.java:77)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)

I've been googling, but all I can find is stuff along the lines of "I
know this is bad, but how do I increase the counter limit?" For a
table with only a few thousand cells, I think I must be doing
something wrong if I'm hitting a limit of a built-in mapreduce job
(I'd expect it to run slower, rather than fail).

I'm running HBase 0.90.6 on a small test cluster (1 NN, 1 SNN, 1 JT, 1
HMaster, 2 ZK, 3 slaves).

Is increasing the counter limit the right strategy here, or is there
some way to use CellCounter without changing that? In the "real"
cluster, there will be more slaves, but also quite a bit more data.
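
The trace shows the exception being thrown from a getCounter call inside the mapper, at the point where a counter is looked up by name. A minimal sketch of that failure mode follows, under the assumption (not confirmed anywhere in this message) that the job names some of its counters after values found in the data, such as column qualifiers. CounterRegistry, CountersExceededError and count_cells are hypothetical stand-ins written for illustration; they are not Hadoop or HBase APIs. In this toy model, incrementing an existing counter is always allowed, and only a new distinct name uses up one of the 120 slots.

```python
class CountersExceededError(Exception):
    """Stand-in for the CountersExceededException seen in the trace."""


class CounterRegistry:
    """Toy model of a counter table with a hard cap on distinct names."""

    def __init__(self, limit=120):
        self.limit = limit
        self.counts = {}

    def increment(self, group, name, amount=1):
        key = (group, name)
        if key not in self.counts:
            # A new name needs a new slot; existing names can grow freely.
            if len(self.counts) >= self.limit:
                raise CountersExceededError(
                    "Exceeded limits on number of counters - "
                    f"Counters={len(self.counts)} Limit={self.limit}"
                )
            self.counts[key] = 0
        self.counts[key] += amount


def count_cells(cells, registry):
    """Hypothetical mapper: one shared counter plus one per distinct qualifier."""
    for _row, qualifier in cells:
        registry.increment("totals", "CELLS")
        registry.increment("qualifiers", qualifier)


if __name__ == "__main__":
    # Three cells with two distinct qualifiers fit easily.
    small = CounterRegistry()
    count_cells([("r1", "a"), ("r1", "b"), ("r2", "a")], small)
    print(len(small.counts), "counters used")

    # A couple thousand cells, each with its own qualifier, run out of slots.
    big = CounterRegistry()
    try:
        count_cells([(f"r{i}", f"q{i}") for i in range(2000)], big)
    except CountersExceededError as err:
        print("failed:", err)
```

If that assumption holds, the failure depends on how many distinct names appear in the data and not on the raw cell count: in this model, 2000 cells spread over five qualifiers use only six counters.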
