Just to follow up on this issue: after collecting some data and setting up additional tests, we have managed to pinpoint the problem to the ScalaBuff-generated code that decodes enumerations. After switching to the ScalaPB generator instead, the problem was gone.

One peculiar thing about this bug, however, is that it seems to manifest only on Flink. We have a number of ad-hoc streaming pipelines (without Flink) that still use the very same decoder code and have been running for weeks without any apparent memory or performance issues. The versions of Flink on which we saw this happen are 1.0 and 1.0.1.


I suspect it's a GC issue with the code generated by ScalaBuff. Can you maybe try a standalone test where you use a while(true) loop to see how fast you can deserialize elements of your Foo type?
If the JVM heap keeps growing the whole time, there is probably a memory leak somewhere.
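
The standalone test suggested above could look roughly like the following Java sketch. Everything named here is hypothetical: the class `DecodeLoopCheck`, the `run` helper, and above all the stand-in decoder, which only builds a String and would have to be replaced by the generated ScalaBuff (or ScalaPB) parser for Foo. It runs a fixed number of batches instead of while(true) so that it terminates; raise the batch count for a longer run.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DecodeLoopCheck {

    /**
     * Decodes the same payload over and over, printing throughput and used heap
     * after every batch. Returns the used-heap reading (in bytes) of each batch.
     */
    static long[] run(Function<byte[], Object> decoder, byte[] payload,
                      int batches, int perBatch) {
        Runtime rt = Runtime.getRuntime();
        long[] usedHeap = new long[batches];
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < perBatch; i++) {
                Object decoded = decoder.apply(payload);
                if (decoded == null) {
                    throw new IllegalStateException("decoder returned null");
                }
            }
            long elapsedNanos = Math.max(1, System.nanoTime() - start);
            // Heap currently in use; individual readings are noisy because of GC.
            usedHeap[b] = rt.totalMemory() - rt.freeMemory();
            System.out.printf("batch %d: %.0f elements/s, used heap %d MB%n",
                    b, perBatch * 1e9 / elapsedNanos, usedHeap[b] >> 20);
        }
        return usedHeap;
    }

    public static void main(String[] args) {
        // ASSUMPTION: placeholder payload and decoder. Replace both with a real
        // serialized Foo message and the generated parser for Foo.
        byte[] payload = "example".getBytes(StandardCharsets.UTF_8);
        Function<byte[], Object> decoder =
                bytes -> new String(bytes, StandardCharsets.UTF_8);
        run(decoder, payload, 10, 1_000_000);
    }
}
```

A single heap reading says little, since the collector may simply not have run yet. What would point to a leak is used heap that trends upward across many batches while throughput drops.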

(1) Which Flink version are you using for this?

(2) Can you also get a heap dump after the job slows down? Slowdowns
like this are often caused by some component leaking memory, maybe in
Flink, maybe the ScalaBuff deserializer. Can you also share the Foo

