hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: client thread stuck on HBaseClient.call
Date Tue, 17 Jan 2012 22:13:56 GMT
That stack trace is really just a debug message left in the Hadoop
code (not even HBase!). Also it's surprising that we create a
Configuration there, but that's another issue...

So there's something weird with that row, or maybe the following rows
too? Could you start a scanner after that row and see if it completes?
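
For example, something like this (rough, untested sketch; it assumes the
same HTable "table" and Bytes helpers as your snippet below, and
"lastGoodRow" is a placeholder for the key of the last row you got back
before the hang):

        // Resume scanning just past the last row that came back before
        // the hang. Appending a zero byte makes the (inclusive) start
        // key strictly greater than that row.
        byte[] startAfter = Bytes.add(Bytes.toBytes("lastGoodRow"),
                new byte[] { 0 });
        Scan probe = new Scan(startAfter);
        probe.setCaching(1);   // one row per RPC while debugging
        ResultScanner rs = table.getScanner(probe);
        try {
            for (Result r : rs) {
                // if this reaches the end of the table, the problem is
                // isolated to that row (or the ones right around it)
            }
        } finally {
            rs.close();
        }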

Then when the scanner is stuck (I guess it fails with a
SocketTimeoutException after 60 seconds?), did you try doing a jstack
on the region server that's hosting the region? You could also run the
HFile tool on that region and see what's going on with your data; look
at 8.7.5.2.2 under http://hbase.apache.org/book.html#regions.arch
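
Something along these lines (the pid and file path are placeholders you
would fill in for your setup):

        # thread dump of the region server hosting that region
        jstack <region server pid> > rs.jstack

        # dump/verify one of that region's HFiles with the HFile tool
        # described in the book section above
        ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile \
            -v -f <path to an hfile under that region>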

Hope this helps,

J-D

On Sat, Jan 14, 2012 at 5:59 AM, Joel Halbert <joel@su3analytics.com> wrote:
> So in summary, using HBase 0.90.5, Java 6u30, standalone, any client,
> including the shell, gets stuck at ~ record 10k.
> If I run shell> count 'table' it stalls at the 10k count.
>
> If I run HBase at TRACE level I see this in the logs, repeating. Could it
> be related?
>
> 2012-01-14 13:57:05,883 DEBUG org.apache.hadoop.ipc.HBaseServer:  got #5
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: PRI IPC
> Server handler 2 on 40160: has #5 from 127.0.0.1:52866
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: Served:
> close queueTime= 0 procesingTime= 0
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder: responding to #5 from 127.0.0.1:52866
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder: responding to #5 from 127.0.0.1:52866 Wrote 8 bytes.
> 2012-01-14 13:57:05,903 DEBUG org.apache.hadoop.ipc.HBaseServer:  got #6
> 2012-01-14 13:57:05,904 DEBUG org.apache.hadoop.conf.Configuration:
> java.io.IOException: config()
>        at
> org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
>        at
> org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
>        at org.apache.hadoop.hbase.client.Scan.createForName(Scan.java:504)
>        at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:524)
>        at
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:127)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:978)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>        at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)
>
>
>
> On 14/01/12 11:28, Joel Halbert wrote:
>>
>> This problem appears to be unrelated to my use of a scanner, or the client
>> code.
>>
>> If in the hbase shell I run
>>
>> count 'table'
>>
>> it also gets stuck, at around record number 10,000.
>>
>> Is this a corrupted table? Is there any way to repair?
>>
>>
>>
>> On 13/01/12 23:03, Joel Halbert wrote:
>>>
>>> It always hangs waiting on the same record....
>>>
>>>
>>> On 13/01/12 22:48, Joel Halbert wrote:
>>>>
>>>> Successfully got a few thousand results....nothing exceptional in the
>>>> hbase log:
>>>>
>>>> 2012-01-13 22:42:13,830 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
>>>> 2012-01-13 22:42:13,832 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
>>>> 2012-01-13 22:42:32,580 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=332.03 MB, free=61.32 MB, max=393.35 MB, blocks=1524, accesses=720942, hits=691565, hitRatio=95.92%, cachingAccesses=720938, cachingHits=691565, cachingHitsRatio=95.92%, evictions=149, evicted=27849, evictedPerRun=186.90603637695312
>>>> 2012-01-13 22:42:36,222 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Server information: localhost.localdomain,59902,1326492448413=15
>>>> 2012-01-13 22:42:36,223 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=1 regions=15 average=15.0 mostloaded=15 leastloaded=15
>>>> 2012-01-13 22:42:36,236 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 14 catalog row(s) and gc'd 0 unreferenced parent region(s)
>>>>
>>>>
>>>>
>>>> On 13/01/12 22:46, T Vinod Gupta wrote:
>>>>>
>>>>> did u get any scan results at all?
>>>>> check your region server and master hbase logs for any warnings..
>>>>>
>>>>> also, just fyi - the standalone version of hbase is not super stable. i
>>>>> have had many similar problems in the past. the distributed mode is much
>>>>> more robust.
>>>>>
>>>>> thanks
>>>>>
>>>>> On Fri, Jan 13, 2012 at 2:36 PM, Joel Halbert <joel@su3analytics.com>
>>>>>  wrote:
>>>>>
>>>>>> I have a standalone instance of HBASE (single instance, on localhost).
>>>>>>
>>>>>> After reading a few thousand records using a scanner my thread is
>>>>>> stuck
>>>>>> waiting:
>>>>>>
>>>>>> "main" prio=10 tid=0x00000000016d4800 nid=0xf3a in Object.wait()
>>>>>> [0x00007fbe96dc3000]
>>>>>>   java.lang.Thread.State: WAITING (on object monitor)
>>>>>>    at java.lang.Object.wait(Native Method)
>>>>>>    at java.lang.Object.wait(Object.**java:503)
>>>>>>    at org.apache.hadoop.hbase.ipc.**HBaseClient.call(HBaseClient.**
>>>>>> java:757)
>>>>>>    - locked<0x00000007e2ba21d0>  (a org.apache.hadoop.hbase.ipc.**
>>>>>> HBaseClient$Call)
>>>>>>    at org.apache.hadoop.hbase.ipc.**HBaseRPC$Invoker.invoke(**
>>>>>> HBaseRPC.java:257)
>>>>>>    at $Proxy4.next(Unknown Source)
>>>>>>    at org.apache.hadoop.hbase.**client.ScannerCallable.call(**
>>>>>> ScannerCallable.java:79)
>>>>>>    at org.apache.hadoop.hbase.**client.ScannerCallable.call(**
>>>>>> ScannerCallable.java:38)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>> HConnectionImplementation.**getRegionServerWithRetries(**
>>>>>> HConnectionManager.java:1019)
>>>>>>    at org.apache.hadoop.hbase.**client.MetaScanner.metaScan(**
>>>>>> MetaScanner.java:182)
>>>>>>    at org.apache.hadoop.hbase.**client.MetaScanner.metaScan(**
>>>>>> MetaScanner.java:95)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>> HConnectionImplementation.**prefetchRegionCache(**
>>>>>> HConnectionManager.java:649)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>> HConnectionImplementation.**locateRegionInMeta(**
>>>>>> HConnectionManager.java:703)
>>>>>>    - locked<0x00000007906dfcf8>  (a java.lang.Object)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>
>>>>>> HConnectionImplementation.**locateRegion(**HConnectionManager.java:594)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>
>>>>>> HConnectionImplementation.**locateRegion(**HConnectionManager.java:559)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>> HConnectionImplementation.**getRegionLocation(**
>>>>>> HConnectionManager.java:416)
>>>>>>    at
>>>>>> org.apache.hadoop.hbase.**client.ServerCallable.**instantiateServer(
>>>>>> **ServerCallable.java:57)
>>>>>>    at org.apache.hadoop.hbase.**client.ScannerCallable.**
>>>>>> instantiateServer(**ScannerCallable.java:63)
>>>>>>    at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>> HConnectionImplementation.**getRegionServerWithRetries(**
>>>>>> HConnectionManager.java:1018)
>>>>>>    at org.apache.hadoop.hbase.**client.HTable$ClientScanner.**
>>>>>> nextScanner(HTable.java:1104)
>>>>>>    at org.apache.hadoop.hbase.**client.HTable$ClientScanner.**
>>>>>> next(HTable.java:1196)
>>>>>>    at org.apache.hadoop.hbase.**client.HTable$ClientScanner$1.**
>>>>>> hasNext(HTable.java:1256)
>>>>>>    at crawler.cache.PageCache.**accept(PageCache.java:254)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Concretely, it is stuck on the iterator.next method:
>>>>>>
>>>>>>        Scan scan = new Scan(Bytes.toBytes(hostnameTarget),
>>>>>>                Bytes.toBytes(hostnameTarget + (char) 127));
>>>>>>        scan.setMaxVersions(1);
>>>>>>        scan.setCaching(4);
>>>>>>        ResultScanner resscan = table.getScanner(scan);
>>>>>>        Iterator<Result> it = resscan.iterator();
>>>>>>        while (it.hasNext()) {              // stuck here
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any clues?
>>>>>>
>>>>
>>>>
>>>
>>
>
