hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBaseClient call doesn't timeout
Date Mon, 08 Mar 2010 19:41:43 GMT
So the hang is probably due to the WREs. The unknown scanner exception
looks like an incompatibility issue, although I remember looking at it
before releasing 0.20.3... I'll check it out.

J-D
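
A note on the hang itself: in the client stack trace quoted below, the
blocked call (HBaseClient.call, reached from ClientScanner.next) runs on
AWT-EventQueue-0, so the GUI freezes along with it. Until the table is
repaired, one way to keep the caller responsive is to run the blocking call
on a worker thread and bound the wait. The following is a minimal sketch
using only java.util.concurrent (Java 8 or later for the lambda syntax).
BoundedCall and callWithTimeout are hypothetical names, not HBase APIs, and
the sketch makes no assumption about whether the HBase client reacts to
thread interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCall {

    /**
     * Hypothetical helper (not part of HBase): runs a blocking task on a
     * daemon worker thread and stops waiting after timeoutMs milliseconds.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMs)
            throws TimeoutException, ExecutionException, InterruptedException {
        // Daemon thread so a stuck call cannot keep the JVM alive on exit.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // Wait at most timeoutMs for the result.
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Request interruption of the worker; the task may ignore it.
            future.cancel(true);
            throw e;
        } finally {
            // Stop accepting work; does not wait for a stuck task to finish.
            executor.shutdownNow();
        }
    }
}
```

With the code from the original message, usage might look like
BoundedCall.callWithTimeout(() -> _scanner.next(defaultPageSize), 60000L),
where the 60 second limit is an arbitrary example value. The timeout only
returns control to the caller; the worker thread may stay blocked if the
underlying RPC ignores the interrupt, so this is a mitigation, not a fix.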

On Mon, Mar 8, 2010 at 11:38 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> The hbase client hang issue happened between 0.20.1 client and 0.20.1
> server.
>
> If I switch the client to the 0.20.3 HBase jar, I see the following in the
> region server log:
>
> 2010-03-08 10:31:53,571 ERROR [IPC Server handler 25 on 60020] regionserver.HRegionServer(844):
> org.apache.hadoop.hbase.UnknownScannerException: Name: -1
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1925)
>        at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
> Thanks
>
> On Mon, Mar 8, 2010 at 10:40 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Ted,
>>
>> Are you using a mix of 0.20.3 and 0.20.1 like you said previously? If
>> so, your client here is .3 and the server is .1?
>>
>> WRT the WREs, like Stack said in a previous email, they happen when
>> something goes wrong, such as HDFS errors intermixed with region splits.
>> We are currently working on improving HBase's reliability in the
>> face of such errors. To repair your table, I suggest running the
>> bin/add_table.rb script (see the comments at the top of that file for
>> usage information). Otherwise your clients will always get those errors
>> and will just keep retrying.
>>
>> J-D
>>
>> On Mon, Mar 8, 2010 at 10:23 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > Hi,
>> > We use the following code to retrieve data from HBase 0.20.1; the call
>> > had not returned after 30 minutes:
>> >
>> >        ResultScanner _scanner = _data;
>> >        try {
>> >          Result[] _results = _scanner.next(defaultPageSize);
>> >          updateResults(pTable, _scanner, _results);
>> >
>> > No exception is thrown.
>> > Here are the stack traces from the client:
>> >
>> > "IPC Client (47) connection to /10.10.31.135:60020 from an unknown user" daemon prio=10 tid=0x00000000560de800 nid=0x34bc runnable [0x000000004249b000]
>> >   java.lang.Thread.State: RUNNABLE
>> >        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>> >        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>> >        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>> >        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>> >        - locked <0x00002aaaaffc9cb8> (a sun.nio.ch.Util$1)
>> >        - locked <0x00002aaaaffc9ca0> (a java.util.Collections$UnmodifiableSet)
>> >        - locked <0x00002aaaaffc9678> (a sun.nio.ch.EPollSelectorImpl)
>> >        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>> >        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
>> >        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
>> >        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>> >        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>> >        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:279)
>> >        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>> >        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>> >        - locked <0x00002aaaaffb2878> (a java.io.BufferedInputStream)
>> >        at java.io.DataInputStream.readInt(DataInputStream.java:370)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:504)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:448)
>> >
>> > "IPC Client (47) connection to /10.10.31.136:60020 from an unknown user" daemon prio=10 tid=0x00000000565c1800 nid=0x34bb in Object.wait() [0x000000004239a000]
>> >   java.lang.Thread.State: TIMED_WAITING (on object monitor)
>> >        at java.lang.Object.wait(Native Method)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:404)
>> >        - locked <0x00002aaaaffad298> (a org.apache.hadoop.hbase.ipc.HBaseClient$Connection)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:447)
>> >
>> > "AWT-EventQueue-0" prio=10 tid=0x0000000056456800 nid=0x34a8 in Object.wait() [0x0000000041793000]
>> >   java.lang.Thread.State: WAITING (on object monitor)
>> >        at java.lang.Object.wait(Native Method)
>> >        at java.lang.Object.wait(Object.java:485)
>> >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:717)
>> >        - locked <0x00002aaaae2d8d48> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
>> >        at $Proxy0.close(Unknown Source)
>> >        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:101)
>> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:72)
>> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
>> >        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:988)
>> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1887)
>> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:2033)
>> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:2051)
>> >        at net.kindsight.webmap.gui.HBaseViewerPanel.updateUI(HBaseViewerPanel.java:390)
>> >        at net.kindsight.webmap.gui.HBaseViewerPanel.access$900(HBaseViewerPanel.java:45)
>> >        at net.kindsight.webmap.gui.HBaseViewerPanel$5.run(HBaseViewerPanel.java:362)
>> >        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:199)
>> >        at java.awt.EventQueue.dispatchEvent(EventQueue.java:597)
>> >        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:269)
>> >        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:184)
>> >        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:174)
>> >        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:169)
>> >        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:161)
>> >        at java.awt.EventDispatchThread.run(EventDispatchThread.java:122)
>> >
>> > The regionserver on 10.10.31.136 ran fine.
>> > For regionserver on 10.10.31.135, apart from some WREs, things were normal:
>> > 2010-03-08 10:21:16,791 ERROR [IPC Server handler 17 on 60020] regionserver.HRegionServer(844):
>> > org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion
>> > ruletable,com.hoovers.www\x2Fcompanyindex\x2FMichigan\x2FBattle_Creek\x2FMarketing_and_Advertising_Services-1.html,1267860376861,
>> > startKey='com.hoovers.www\x2Fcompanyindex\x2FMichigan\x2FBattle_Creek\x2FMarketing_and_Advertising_Services-1.html',
>> > getEndKey()='com.hoovers.www\x2Fcompanyindex\x2FNew_Mexico\x2FVado\x2FBroadcasting_Industry-1.html',
>> > row='com.xmradio.www\x2FpadData\x2Fpad_data_servlet.jsp\x3Fchannel\x3D66\x26rpc\x3DXMROUS\x26rnd\x3D8378'
>> >  at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1522)
>> >  at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1554)
>> >  at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1622)
>> >  at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2278)
>> >  at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1785)
>> >  at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>> >  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >  at java.lang.reflect.Method.invoke(Method.java:597)
>> >  at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > Has anyone seen a similar situation?
>> > Thanks
>> >
>>
>
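
The WrongRegionException in the region server log above reports that the
requested row falls outside the region's range, which runs from startKey
(inclusive) to the end key (exclusive). A minimal sketch of that kind of
check follows, assuming keys are ordered by unsigned byte-wise lexicographic
comparison and that an empty key means "unbounded". RowRangeCheck and its
methods are hypothetical names for illustration, not HBase's implementation.
The keys are the ones from the log with the \x2F-style escapes decoded.

```java
import java.nio.charset.StandardCharsets;

public class RowRangeCheck {

    /** Unsigned, byte-wise lexicographic comparison of two keys. */
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // Equal on the common prefix: the shorter key sorts first.
        return a.length - b.length;
    }

    /** True if startKey <= row < endKey; an empty key is treated as unbounded. */
    public static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = startKey.length == 0 || compareBytes(startKey, row) <= 0;
        boolean beforeEnd = endKey.length == 0 || compareBytes(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    public static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] startKey = utf8("com.hoovers.www/companyindex/Michigan/Battle_Creek/"
                + "Marketing_and_Advertising_Services-1.html");
        byte[] endKey = utf8("com.hoovers.www/companyindex/New_Mexico/Vado/"
                + "Broadcasting_Industry-1.html");
        byte[] row = utf8("com.xmradio.www/padData/pad_data_servlet.jsp"
                + "?channel=66&rpc=XMROUS&rnd=8378");
        // "com.x..." sorts after "com.h...", so the row is past the end key.
        System.out.println(rowInRange(startKey, endKey, row)); // prints "false"
    }
}
```

The row begins with "com.xmradio", which sorts after both region boundaries
("com.hoovers..."), so a request for it routed to this region fails the check.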
