hbase-user mailing list archives

From Ben Watson <...@bwinsights.co.uk>
Subject HBase Coprocessor calls fail with RegionNotFoundException during region splits
Date Tue, 19 Feb 2019 15:37:33 GMT
Hello,

I’m running HBase 1.4.4 with a simple endpoint coprocessor that sums
records when called. Whenever a region split occurs, calls to the
coprocessor fail with a NotServingRegionException. The client spends about
10 minutes retrying the call 35 times:

2019-02-19 09:42:34 INFO  o.a.h.h.c.RpcRetryingCaller [hconnection-0x100f9a76-shared--pool3-t215]: Call exception, tries=25, retries=35, started=331810 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region coprocessor-test,1,1550568604433.63f03f2a494dc5756238ba08af437af6. is not online on <hostname>,16020,1550568101996
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3082)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1275)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2201)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
row '1_pfx-cfb0e548-f399-4059-af80-54fe9b7a828f' on table 'coprocessor-test' at region=coprocessor-test,1_pfx-7b2b6071-7d2c-4282-9645-31ca027327dc6549,1550568988094.f6cc0c6245702c544fb7fe65c1e3299b., hostname=<hostname>l,16020,1550568101996, seqNum=630

before eventually failing:

Tue Feb 19 09:37:02 UTC 2019, RpcRetryingCaller{globalStartTime=1550569022304, pause=100, retries=35}, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region coprocessor-test,9,1550568604433.2d98945e85cca401a2c5d8bd777a0451. is not online on <hostname>,16020,1550568099593
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3082)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1275)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2201)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)

If I then re-run the coprocessor call, it works without any issues. So I
need a way to catch this error quickly and retry the call manually until it
succeeds. I can't see any useful parameter to change: the 35 retries and
the pause between retries seem to be hardcoded.
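
The manual retry described above could look roughly like the sketch below.
It is plain Java with no HBase dependency, and the names QuickRetry and
callWithRetry are hypothetical, chosen only for illustration. It assumes
the coprocessor call can be wrapped in a Callable and that the failure
surfaces as an exception (or a cause of one) of a known type, such as
NotServingRegionException.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: not part of HBase, shown only to illustrate the idea.
public final class QuickRetry {
    private QuickRetry() {}

    /**
     * Runs the call and retries only when the thrown exception, or one of
     * its causes, is an instance of retryOn. Any other exception is
     * rethrown immediately. After maxAttempts failed attempts the last
     * exception is rethrown.
     */
    public static <T> T callWithRetry(Callable<T> call,
                                      Class<? extends Throwable> retryOn,
                                      int maxAttempts,
                                      long pauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!hasCause(e, retryOn)) {
                    throw e; // not the error we are waiting out
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // fixed pause between attempts
                }
            }
        }
        throw last;
    }

    // Walks the cause chain looking for an instance of the given type.
    private static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }
}
```

A wrapper like this only helps if the inner call fails fast. The
retries=35 and pause=100 values in the log match the defaults of the
client settings hbase.client.retries.number and hbase.client.pause, so
lowering those on the client Configuration may be what shortens the
10-minute wait; I have not verified that against this cluster.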

Can anyone suggest how I can go about solving this?

Regards,

Ben
