hbase-user mailing list archives

From Peter Wolf <opus...@gmail.com>
Subject Re: client thread stuck on HBaseClient.call
Date Wed, 18 Jan 2012 02:28:59 GMT
I am also getting stuck using 0.90.5.  I am just doing a simple scan in
Java, and it sometimes hangs while iterating in the scanner.  I am also seeing
my 'hbase shell' get stuck while counting rows and doing simple queries.
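
For reference, the shell count I mean is just the stock command (the table
name here is a placeholder; INTERVAL only makes the progress visible so the
stall point shows up):

    count 'mytable', INTERVAL => 1000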

I'm not doing anything fancy.

P



On 1/17/12 5:13 PM, Jean-Daniel Cryans wrote:
> That stack trace is really just a debug message left in the Hadoop
> code (not even HBase!). Also it's surprising that we create a
> Configuration there, but that's another issue...
>
> So there's something weird with that row, or maybe the following rows
> too? Could you start a scanner after that row and see if it completes?
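>
> For example, something along these lines (just a sketch; the table handle
> and the stuck row key are placeholders):
>
>     // Resume a fresh scan just past the suspect row and see if it runs to the end.
>     byte[] stuckRow = Bytes.toBytes("row-it-hangs-on");
>     Scan scan = new Scan(Bytes.add(stuckRow, new byte[] { 0 }));  // first key after stuckRow
>     ResultScanner scanner = table.getScanner(scan);
>     try {
>       for (Result r : scanner) {
>         // just drain the results; if this completes, the problem is local to that row
>       }
>     } finally {
>       scanner.close();
>     }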
>
> Then when the scanner is stuck (I guess it fails on a
> SocketTimeoutException after 60 seconds?) did you try doing a jstack
> on the region server that's hosting the region? You could also try the
> HFile tool on that region and see what's going on with your data; look
> at 8.7.5.2.2 under http://hbase.apache.org/book.html#regions.arch
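>
> Roughly like this (the pid and file path are placeholders; in standalone
> mode the store files live under your hbase.rootdir on the local filesystem):
>
>     jstack <regionserver-pid>
>     hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f <path-to-a-store-file-of-that-region>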
>
> Hope this helps,
>
> J-D
>
> On Sat, Jan 14, 2012 at 5:59 AM, Joel Halbert <joel@su3analytics.com> wrote:
>> So in summary, using HBase 0.90.5, Java 6u30, standalone, any client,
>> including the shell, gets stuck at ~ record 10k.
>> If I run shell>  count 'table' it stalls at the 10k count.
>>
>> If I run HBase at TRACE level I see this in the logs, repeating; could it
>> be related?
>>
>> 2012-01-14 13:57:05,883 DEBUG org.apache.hadoop.ipc.HBaseServer:  got #5
>> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: PRI IPC
>> Server handler 2 on 40160: has #5 from 127.0.0.1:52866
>> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: Served:
>> close queueTime= 0 procesingTime= 0
>> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server
>> Responder: responding to #5 from 127.0.0.1:52866
>> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server
>> Responder: responding to #5 from 127.0.0.1:52866 Wrote 8 bytes.
>> 2012-01-14 13:57:05,903 DEBUG org.apache.hadoop.ipc.HBaseServer:  got #6
>> 2012-01-14 13:57:05,904 DEBUG org.apache.hadoop.conf.Configuration:
>> java.io.IOException: config()
>>         at
>> org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
>>         at
>> org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
>>         at org.apache.hadoop.hbase.client.Scan.createForName(Scan.java:504)
>>         at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:524)
>>         at
>> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:127)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:978)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>>         at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>> On 14/01/12 11:28, Joel Halbert wrote:
>>> This problem appears to be unrelated to my use of a scanner, or the client
>>> code.
>>>
>>> If in the hbase shell I run
>>>
>>> count 'table'
>>>
>>> it also gets stuck at around record number 10,000.
>>>
>>> Is this a corrupted table? Is there any way to repair?
>>>
>>>
>>>
>>> On 13/01/12 23:03, Joel Halbert wrote:
>>>> It always hangs waiting on the same record....
>>>>
>>>>
>>>> On 13/01/12 22:48, Joel Halbert wrote:
>>>>> Successfully got a few thousand results....nothing exceptional in the
>>>>> hbase log:
>>>>>
>>>>> 2012-01-13 22:42:13,830 INFO org.apache.hadoop.io.compress.CodecPool:
>>>>> Got brand-new decompressor
>>>>> 2012-01-13 22:42:13,832 INFO org.apache.hadoop.io.compress.CodecPool:
>>>>> Got brand-new decompressor
>>>>> 2012-01-13 22:42:32,580 DEBUG
>>>>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRUStats: total=332.03
>>>>> MB, free=61.32 MB, max=393.35 MB, blocks=1524, accesses=720942,
>>>>> hits=691565, hitRatio=95.92%, cachingAccesses=720938,
>>>>> cachingHits=691565, cachingHitsRatio=95.92%, evictions=149,
>>>>> evicted=27849, evictedPerRun=186.90603637695312
>>>>> 2012-01-13 22:42:36,222 DEBUG
>>>>> org.apache.hadoop.hbase.master.LoadBalancer: Server information:
>>>>> localhost.localdomain,59902,1326492448413=15
>>>>> 2012-01-13 22:42:36,223 INFO
>>>>> org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
>>>>> servers=1 regions=15 average=15.0 mostloaded=15 leastloaded=15
>>>>> 2012-01-13 22:42:36,236 DEBUG
>>>>> org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 14 catalog row(s)
>>>>> and gc'd 0 unreferenced parent region(s)
>>>>>
>>>>>
>>>>>
>>>>> On 13/01/12 22:46, T Vinod Gupta wrote:
>>>>>> did you get any scan results at all?
>>>>>> check your region server and master hbase logs for any warnings..
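>>>>>>
>>>>>> e.g. something like (log file names depend on your install):
>>>>>>
>>>>>>     grep -E "WARN|ERROR" $HBASE_HOME/logs/hbase-*.log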
>>>>>>
>>>>>> also, just fyi - the standalone version of hbase is not super stable. i
>>>>>> have had many similar problems in the past. the distributed mode is
>>>>>> much more robust.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> On Fri, Jan 13, 2012 at 2:36 PM, Joel Halbert <joel@su3analytics.com> wrote:
>>>>>>
>>>>>>> I have a standalone instance of HBase (single instance, on localhost).
>>>>>>>
>>>>>>> After reading a few thousand records using a scanner my thread is
>>>>>>> stuck waiting:
>>>>>>>
>>>>>>> "main" prio=10 tid=0x00000000016d4800 nid=0xf3a in Object.wait()
>>>>>>> [0x00007fbe96dc3000]
>>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>     at java.lang.Object.wait(Native Method)
>>>>>>>     at java.lang.Object.wait(Object.**java:503)
>>>>>>>     at org.apache.hadoop.hbase.ipc.**HBaseClient.call(HBaseClient.**
>>>>>>> java:757)
>>>>>>>     - locked<0x00000007e2ba21d0>    (a org.apache.hadoop.hbase.ipc.**
>>>>>>> HBaseClient$Call)
>>>>>>>     at org.apache.hadoop.hbase.ipc.**HBaseRPC$Invoker.invoke(**
>>>>>>> HBaseRPC.java:257)
>>>>>>>     at $Proxy4.next(Unknown Source)
>>>>>>>     at org.apache.hadoop.hbase.**client.ScannerCallable.call(**
>>>>>>> ScannerCallable.java:79)
>>>>>>>     at org.apache.hadoop.hbase.**client.ScannerCallable.call(**
>>>>>>> ScannerCallable.java:38)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>> HConnectionImplementation.**getRegionServerWithRetries(**
>>>>>>> HConnectionManager.java:1019)
>>>>>>>     at org.apache.hadoop.hbase.**client.MetaScanner.metaScan(**
>>>>>>> MetaScanner.java:182)
>>>>>>>     at org.apache.hadoop.hbase.**client.MetaScanner.metaScan(**
>>>>>>> MetaScanner.java:95)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>> HConnectionImplementation.**prefetchRegionCache(**
>>>>>>> HConnectionManager.java:649)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>> HConnectionImplementation.**locateRegionInMeta(**
>>>>>>> HConnectionManager.java:703)
>>>>>>>     - locked<0x00000007906dfcf8>    (a java.lang.Object)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>>
>>>>>>> HConnectionImplementation.**locateRegion(**HConnectionManager.java:594)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>>
>>>>>>> HConnectionImplementation.**locateRegion(**HConnectionManager.java:559)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>> HConnectionImplementation.**getRegionLocation(**
>>>>>>> HConnectionManager.java:416)
>>>>>>>     at
>>>>>>> org.apache.hadoop.hbase.**client.ServerCallable.**instantiateServer(
>>>>>>> **ServerCallable.java:57)
>>>>>>>     at org.apache.hadoop.hbase.**client.ScannerCallable.**
>>>>>>> instantiateServer(**ScannerCallable.java:63)
>>>>>>>     at org.apache.hadoop.hbase.**client.HConnectionManager$**
>>>>>>> HConnectionImplementation.**getRegionServerWithRetries(**
>>>>>>> HConnectionManager.java:1018)
>>>>>>>     at org.apache.hadoop.hbase.**client.HTable$ClientScanner.**
>>>>>>> nextScanner(HTable.java:1104)
>>>>>>>     at org.apache.hadoop.hbase.**client.HTable$ClientScanner.**
>>>>>>> next(HTable.java:1196)
>>>>>>>     at org.apache.hadoop.hbase.**client.HTable$ClientScanner$1.**
>>>>>>> hasNext(HTable.java:1256)
>>>>>>>     at crawler.cache.PageCache.**accept(PageCache.java:254)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Concretely, it is stuck on the scanner iterator's hasNext() call:
>>>>>>>
>>>>>>>         Scan scan = new Scan(Bytes.toBytes(hostnameTarget),
>>>>>>>                 Bytes.toBytes(hostnameTarget + (char) 127));
>>>>>>>         scan.setMaxVersions(1);
>>>>>>>         scan.setCaching(4);
>>>>>>>         ResultScanner resscan = table.getScanner(scan);
>>>>>>>         Iterator<Result> it = resscan.iterator();
>>>>>>>         while (it.hasNext()) {              // stuck here
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Any clues?
>>>>>>>
>>>>>

