flink-user mailing list archives

From Joe Olson <jo4...@outlook.com>
Subject UnknownKvStateKeyGroupLocation
Date Tue, 16 May 2017 12:28:58 GMT
When running Flink in high availability mode, I've been seeing a high number of UnknownKvStateKeyGroupLocation
errors returned from queryable state calls.

If I put a simple getKvState call into a loop that executes once per second, sometimes I get
the expected results and sometimes UnknownKvStateKeyGroupLocation is thrown. This is not
associated with a query timeout, so it does not look like a network issue.
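
The polling loop described above can be sketched roughly as follows. This is not the actual client code: `LookupProbe`, `UnknownLocation`, and `Tally` are hypothetical names, `UnknownLocation` stands in for Flink's UnknownKvStateKeyGroupLocation, and the `Callable` is assumed to wrap whatever blocking getKvState call the client makes. The point is only to count how often each outcome occurs, so the intermittent failures can be lined up against the job manager log.

```java
import java.util.concurrent.Callable;

public class LookupProbe {

    /** Hypothetical stand-in for Flink's UnknownKvStateKeyGroupLocation. */
    public static class UnknownLocation extends Exception {}

    /** Outcome counts collected over one run of the probe. */
    public static class Tally {
        public int ok;
        public int unknownLocation;
        public int other;
    }

    /**
     * Runs the query a fixed number of times, pausing between attempts,
     * and records whether each attempt succeeded, failed with the
     * unknown-location error, or failed for any other reason.
     */
    public static Tally probe(Callable<byte[]> query, int attempts, long pauseMillis)
            throws InterruptedException {
        Tally tally = new Tally();
        for (int i = 0; i < attempts; i++) {
            try {
                query.call();            // assumed to wrap the real state query
                tally.ok++;
            } catch (UnknownLocation e) {
                tally.unknownLocation++; // the intermittent error in question
            } catch (InterruptedException e) {
                throw e;                 // do not swallow interruption
            } catch (Exception e) {
                tally.other++;           // timeouts and anything else
            }
            Thread.sleep(pauseMillis);
        }
        return tally;
    }
}
```

Called with, say, 60 attempts and a 1000 ms pause, this reproduces the once-per-second loop and gives a failure ratio per job, which makes it easier to compare the one misbehaving job against the others.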

From looking at the Flink source code, this problem stems from lookup.getKvStateServerAddress
returning null. I know all the task managers are registering state with the job manager, because
I see the "Key value state registered for job xx under name yy" messages in the job manager log.

Is there anything else I should be looking for? I have several jobs I am querying state on, and this
seems isolated to only one. I've gone over the differences between the jobs very closely, but
they are all built from the same template.

What would cause lookup.getKvStateServerAddress to sometimes succeed and sometimes fail?
