flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Davis <moda...@protonmail.com>
Subject Re: DataSet API: HBase ScannerTimeoutException and double Result processing
Date Mon, 25 Nov 2019 10:57:23 GMT
Hi Flavio,

>> When the resultScanner dies because of a timeout (this happens a lot when you have
backpressure and the time between 2 consecutive reads exceed the scanner timeout), the code
creates a new scanner and restart from where it was (starRow = currentRow).
>> So there should not be any duplicates (in theory), but this could be the root of
the problem..

Yes, you are right, the nextRecord() exception handling is responsible for the duplicate record

org.apache.hadoop.hbase.client.ScannerTimeoutException: 1038878ms passed since the last invocation,
timeout is currently set to 60000
at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:453)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
at org.apache.flink.addons.hbase.AbstractTableInputFormat.nextRecord(AbstractTableInputFormat.java:130)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:192)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException:
Name: 135281, already closed?
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2389)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)

But I am not sure that the handling of the HBase exception thrown from ClientScanner.next()
is correct.
If the call to mapResultToOutType(Result) finished without an error there is no need to restart
from the same row.
The new scanner should start from the next row.
Is that so or am I missing something?

Best regards,
View raw message