hbase-user mailing list archives

From Vivek Krishna <vivekris...@gmail.com>
Subject Re: Row Counters
Date Wed, 16 Mar 2011 22:40:24 GMT
Works. Thanks.
Viv



On Wed, Mar 16, 2011 at 6:21 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> The connection loss was due to an inability to find the ZooKeeper quorum.
>
> Use the command line from my previous email.
>
>
> On Wed, Mar 16, 2011 at 3:18 PM, Vivek Krishna <vivekrishna@gmail.com> wrote:
>
>> Oops. sorry about the environment.
>>
>> I am using hadoop-0.20.2-CDH3B4, and hbase-0.90.1-CDH3B4
>> and zookeeper-3.3.2-CDH3B4.
>>
>> I was able to configure the jars and run the command
>>
>> hadoop jar /usr/lib/hbase/hbase-0.90.1-CDH3B4.jar rowcounter test
>>
>> but I get
>>
>> java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logs lines from the task's full log for more details.
>> 	at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:98)
>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:396)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>> 	at org.apache.hadoop.mapred.Child.main(Child.java:234)
>>
>>
>> The previous error in the task's full log is ..
>>
>>
>> 2011-03-16 21:41:03,367 ERROR org.apache.hadoop.hbase.mapreduce.TableInputFormat: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:988)
>> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:301)
>> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:292)
>> 	at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:155)
>> 	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:167)
>> 	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:145)
>> 	at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:91)
>> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:605)
>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:396)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>> 	at org.apache.hadoop.mapred.Child.main(Child.java:234)
>> Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>> 	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:147)
>> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:986)
>> 	... 15 more
>> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>> 	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
>> 	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
>> 	... 16 more
>>
>>
>> I am pretty sure the ZooKeeper master is running on the same machine at
>> port 2181.  I am not sure why the connection loss occurs.  Do I need
>> HBASE-3578 <https://issues.apache.org/jira/browse/HBASE-3578> by any
>> chance?
>>
>> Viv
>>
>>
>>
>>
>> On Wed, Mar 16, 2011 at 5:36 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> In the future, describe your environment a bit.
>>>
>>> The way I approach this is:
>>> find the correct commandline from
>>> src/main/java/org/apache/hadoop/hbase/mapreduce/package-info.java
>>>
>>> Then I issue:
>>> [hadoop@us01-ciqps1-name01 hbase]$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.1.jar rowcounter packageindex
>>>
>>> Then I check the map/reduce tasks at the job tracker URL.
>>>
>>> On Wed, Mar 16, 2011 at 1:59 PM, Vivek Krishna <vivekrishna@gmail.com> wrote:
>>>
>>> > I guess it is using the mapred class
>>> >
>>> > 11/03/16 20:58:27 INFO mapred.JobClient: Task Id :
>>> > attempt_201103161245_0005_m_000004_0, Status : FAILED
>>> > java.io.IOException: Cannot create a record reader because of a previous
>>> > error. Please look at the previous logs lines from the task's full log for
>>> > more details.
>>> >  at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:98)
>>> >  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>>> >  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>>> >  at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>>> >  at java.security.AccessController.doPrivileged(Native Method)
>>> >  at javax.security.auth.Subject.doAs(Subject.java:396)
>>> >  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>> >  at org.apache.hadoop.mapred.Child.main(Child.java:234)
>>> >
>>> > How do I use mapreduce class?
>>> > Viv
>>> >
>>> >
>>> >
>>> > On Wed, Mar 16, 2011 at 4:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> >
>>> > > Since we have lived so long without this information, I guess we can
>>> > > hold for longer :-)
>>> > > Another issue I am working on is to reduce memory footprint. See the
>>> > > following discussion thread:
>>> > > One of the regionserver aborted, then the master shut down itself
>>> > >
>>> > > We have to bear in mind that there would be around 10K regions or
>>> > > more in production.
>>> > >
>>> > > Cheers
>>> > >
>>> > > On Wed, Mar 16, 2011 at 1:46 PM, Jeff Whiting <jeffw@qualtrics.com> wrote:
>>> > >
>>> > > > Just a random thought.  What about keeping a per-region row count?
>>> > > > Then if you needed to get a row count for a table you'd just have to
>>> > > > query each region once and sum.  Seems like it wouldn't be too
>>> > > > expensive because you'd just have a row counter variable.  It may be
>>> > > > more complicated than I'm making it out to be though...
>>> > > >
>>> > > > ~Jeff
>>> > > >
>>> > > >
>>> > > > On 3/16/2011 2:40 PM, Stack wrote:
>>> > > >
>>> > > >> On Wed, Mar 16, 2011 at 1:35 PM, Vivek Krishna <vivekrishna@gmail.com> wrote:
>>> > > >>
>>> > > >>> 1.  How do I count rows fast in hbase?
>>> > > >>>
>>> > > >>> First I tried count 'test', which takes ages.
>>> > > >>>
>>> > > >>> Saw that I could use RowCounter, but looks like it is deprecated.
>>> > > >>>
>>> > > >> It is not.  Make sure you are using the one from the mapreduce package
>>> > > >> as opposed to the mapred package.
>>> > > >>
>>> > > >>
>>> > > >>> I just need to verify the total counts.  Is it possible to see
>>> > > >>> somewhere in the web interface or ganglia or by any other means?
>>> > > >>>
>>> > > >> We don't keep a current count on a table.  Too expensive.  Run the
>>> > > >> rowcounter MR job.  This page may be of help:
>>> > > >>
>>> > > >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
>>> > > >>
>>> > > >> Good luck,
>>> > > >> St.Ack
>>> > > >>
>>> > > >
>>> > > > --
>>> > > > Jeff Whiting
>>> > > > Qualtrics Senior Software Engineer
>>> > > > jeffw@qualtrics.com
>>> > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>
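
For reference, the command line that resolved this thread can be sketched as a small shell helper. This is only an illustration under stated assumptions, not something any poster wrote: the function name rowcounter_cmd, the HBASE_JAR variable, and the default install paths are hypothetical, and only the shape of the command comes from Ted Yu's message. The helper prints the command instead of running it, so it can be checked before use. Setting HADOOP_CLASSPATH from `hbase classpath` presumably helps because it puts the HBase jars and configuration on the job's classpath, which would let the tasks find the ZooKeeper quorum.

```shell
#!/bin/sh
# Print (do not run) a RowCounter command line shaped like the one in the thread.
# Assumed for illustration: the function name, the HBASE_JAR variable, and the
# default install paths below. Adjust them to match your cluster.
rowcounter_cmd() {
  if [ -z "$1" ]; then
    echo "usage: rowcounter_cmd <table>" >&2
    return 1
  fi
  hbase_home="${HBASE_HOME:-/usr/lib/hbase}"
  hadoop_home="${HADOOP_HOME:-/usr/lib/hadoop}"
  # Jar name follows the 0.90.1 release mentioned in the thread; override
  # HBASE_JAR for a distribution-specific jar such as a CDH build.
  jar="${HBASE_JAR:-$hbase_home/hbase-0.90.1.jar}"
  # Single quotes keep the backticks literal so they are evaluated only when
  # the printed line is later run in a shell.
  printf 'HADOOP_CLASSPATH=`%s/bin/hbase classpath` %s/bin/hadoop jar %s rowcounter %s\n' \
    "$hbase_home" "$hadoop_home" "$jar" "$1"
}
```

Calling rowcounter_cmd with a table name prints one line that can be pasted into a shell on a machine where the hbase and hadoop launchers exist at those paths. Printing first is a deliberate choice here: a wrong jar path or home directory shows up before a MapReduce job is submitted.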
