hbase-issues mailing list archives

From "Robert Fiser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15757) Reverse scan fails with no obvious cause
Date Fri, 06 May 2016 08:37:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273799#comment-15273799 ]

Robert Fiser commented on HBASE-15757:
--------------------------------------

Here is the RS log:
        16/05/06 08:19:46 INFO regionserver.HRegion: Starting compaction on o in region <REPLACED_TABLE_NAME>,156483584365269_775211065825848,1452787869225.f4a36bfadc9903043191313f85d9c99e.
        16/05/06 08:19:46 INFO regionserver.HStore: Starting compaction of 3 file(s) in o of <REPLACED_TABLE_NAME>,156483584365269_775211065825848,1452787869225.f4a36bfadc9903043191313f85d9c99e. into tmpdir=hdfs://sencha/hbase/data/default/<REPLACED_TABLE_NAME>/f4a36bfadc9903043191313f85d9c99e/.tmp, totalSize=159.4 M
        16/05/06 08:19:50 INFO regionserver.HRegionServer: Scanner 42329759 lease expired on region <REPLACED_TABLE_NAME>,76469923274_199526018274\x00\x7F\xFF\xFE\xADb\xE9\xFB\x1F,1454587404590.3a2da3ba070db86dc666337cf6c94f38.
        16/05/06 08:19:51 INFO regionserver.HRegionServer: Scanner 42329808 lease expired on region <REPLACED_TABLE_NAME>,76469923274_199526018274\x00\x7F\xFF\xFE\xADb\xE9\xFB\x1F,1454587404590.3a2da3ba070db86dc666337cf6c94f38.
        16/05/06 08:19:52 INFO regionserver.HRegionServer: Scanner 42329872 lease expired on region <REPLACED_TABLE_NAME>,76469923274_199526018274\x00\x7F\xFF\xFE\xADb\xE9\xFB\x1F,1454587404590.3a2da3ba070db86dc666337cf6c94f38.
        16/05/06 08:19:54 INFO regionserver.HRegionServer: Scanner 42330004 lease expired on region <REPLACED_TABLE_NAME>,76469923274_199526018274\x00\x7F\xFF\xFE\xADb\xE9\xFB\x1F,1454587404590.3a2da3ba070db86dc666337cf6c94f38.
        16/05/06 08:19:54 INFO regionserver.HRegionServer: Scanner 42329973 lease expired on region <REPLACED_TABLE_NAME>,156185069843_10153620394269844\x00\x7F\xFF\xFE\xB0\x05\xF9t\xD7,1454594676584.7f2688ce2f34b378485bc4376d83de66.
        16/05/06 08:19:55 INFO regionserver.HRegionServer: Scanner 42330016 lease expired on region <REPLACED_TABLE_NAME>,156185069843_10153620394269844\x00\x7F\xFF\xFE\xB0\x05\xF9t\xD7,1454594676584.7f2688ce2f34b378485bc4376d83de66.
        16/05/06 08:19:56 INFO regionserver.HRegionServer: Scanner 42330057 lease expired on region <REPLACED_TABLE_NAME>,76469923274_199526018274\x00\x7F\xFF\xFE\xADb\xE9\xFB\x1F,1454587404590.3a2da3ba070db86dc666337cf6c94f38.
        16/05/06 08:19:57 INFO regionserver.HRegionServer: Scanner 42330039 lease expired on region <REPLACED_TABLE_NAME>,156185069843_10153620394269844\x00\x7F\xFF\xFE\xB0\x05\xF9t\xD7,1454594676584.7f2688ce2f34b378485bc4376d83de66.

We have a timestamp in the rowkey, and when the time range is very large a scan may hit thousands of rows; even though we have implemented a limit (5000) to stop scanning, there may be many more rows than that.
I've tried setting caching anywhere from 100 to 5000 with no result.
Now I've found that a shorter range between the start and stop row causes no error at all: all
requests pass. If the range is large, scans fail even when they hit fewer than
100 rows.
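
For reference, a minimal sketch of the reverse scan described above, assuming the 0.98 client API. The table name, the startRow/stopRow values, and the reverseScan helper are placeholders for illustration only; the 5000-row cap is the client-side limit mentioned above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReverseScanSketch {

        // stopRow/startRow stand in for the time-prefixed rowkeys described above.
        static int reverseScan(HTable table, byte[] stopRow, byte[] startRow) throws Exception {
            // For a reversed scan the first argument is the higher (later) key.
            Scan scan = new Scan(stopRow, startRow);
            scan.setReversed(true);
            scan.setCaching(100); // values from 100 to 5000 were tried with no effect
            int rows = 0;
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result r : scanner) {
                    rows++;
                    if (rows >= 5000) { // client-side limit mentioned above
                        break;
                    }
                }
            } finally {
                scanner.close();
            }
            return rows;
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "<REPLACED_TABLE_NAME>"); // placeholder table name
            int n = reverseScan(table, Bytes.toBytes("stopRow"), Bytes.toBytes("startRow"));
            System.out.println("rows read: " + n);
            table.close();
        }
    }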

> Reverse scan fails with no obvious cause
> ----------------------------------------
>
>                 Key: HBASE-15757
>                 URL: https://issues.apache.org/jira/browse/HBASE-15757
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, Scanners
>    Affects Versions: 0.98.12
>         Environment: ubuntu 14.04, amazon cloud; 10 datanodes d2.4xlarge - 16 cores, 12x200GB HDD, 122GB RAM
>            Reporter: Robert Fiser
>
> Related issue on Stack Overflow: http://stackoverflow.com/questions/37001169/hbase-reverse-scan-error?noredirect=1#comment61558097_37001169
> This works well:
>     scan = new Scan(startRow, stopRow);
> This throws an exception sometimes:
>     scan = new Scan(stopRow, startRow);
>     scan.setReversed(true);
> The exception is thrown while traffic is at least 100 req/s. There are actually no timeouts; the exception is fired immediately for 1-10% of requests.
> hbase: 0.98.12-hadoop2;
> hadoop: 2.7.0;
> cluster in AWS, 10 datanodes: d2.4xlarge
> I think it may be related to this issue, but I'm not using any filters: http://apache-hbase.679495.n3.nabble.com/Exception-during-a-reverse-scan-with-filter-td4069721.html
>       java.lang.RuntimeException: org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
> 			at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:94)
> 			at com.socialbakers.broker.client.hbase.htable.AbstractHtableListScanner.scanToList(AbstractHtableListScanner.java:30)
> 			at com.socialbakers.broker.client.hbase.htable.AbstractHtableListSingleScanner.invokeOperation(AbstractHtableListSingleScanner.java:23)
> 			at com.socialbakers.broker.client.hbase.htable.AbstractHtableListSingleScanner.invokeOperation(AbstractHtableListSingleScanner.java:11)
> 			at com.socialbakers.broker.client.hbase.AbstractHbaseApi.endPointMethod(AbstractHbaseApi.java:40)
> 			at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> 			at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 			at java.lang.reflect.Method.invoke(Method.java:497)
> 			at com.socialbakers.broker.client.Route.invoke(Route.java:241)
> 			at com.socialbakers.broker.client.handler.EndpointHandler.invoke(EndpointHandler.java:173)
> 			at com.socialbakers.broker.client.handler.EndpointHandler.process(EndpointHandler.java:69)
> 			at com.thetransactioncompany.jsonrpc2.server.Dispatcher.process(Dispatcher.java:196)
> 			at com.socialbakers.broker.client.RejectableRunnable.run(RejectableRunnable.java:38)
> 			at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 			at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 			at java.lang.Thread.run(Thread.java:745)
>           Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
> 			at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:430)
> 			at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:333)
> 			at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:91)
> 			... 15 more
>           Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 2 But the nextCallSeq got from client: 1; request=scanner_id: 27700695 number_of_rows: 100 close_scanner: false next_call_seq: 1
> 			at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3231)
> 			at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30946)
> 			at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2093)
> 			at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
> 			at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> 			at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> 			at java.lang.Thread.run(Thread.java:745)
> 			at sun.reflect.GeneratedConstructorAccessor16.newInstance(Unknown Source)
> 			at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 			at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> 			at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> 			at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
> 			at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:287)
> 			at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:214)
> 			at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
> 			at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
> 			at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
> 			at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:375)
> 			... 17 more
>           Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 2 But the nextCallSeq got from client: 1; request=scanner_id: 27700695 number_of_rows: 100 close_scanner: false next_call_seq: 1
> 			at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3231)
> 			at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30946)
> 			at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2093)
> 			at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
> 			at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> 			at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> 			at java.lang.Thread.run(Thread.java:745)
> 			at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
> 			at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
> 			at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
> 			at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31392)
> 			at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:173)
> 			... 21 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
