Idle state retention is a trade-off between accuracy and storage consumption. It can satisfy some streaming computation requirements, but not all of them. For instance, in your use case, if each article has a TTL, its praise state can be safely removed once that period has passed. Otherwise, inconsistencies are unavoidable.
We admit that there should be other state retention mechanisms for different scenarios. For now, however, setting a larger retention time or simply omitting this config seem to be the only options.
Isn't this behavior a bit odd, given that it leads to data inconsistency?
In the given example, article_id 123 will always remain in the external storage. Its state has been removed, so the record can no longer be retracted.
Once the state has been removed and the count reaches 10 again, a second record for article_id 123 will be emitted to the data store.
As soon as you enable state retention and the query needs state that has already been removed, the query result can become inconsistent.
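To make the effect concrete, here is a minimal Python sketch of the scenario. It is not Flink code. It only mimics a per-key count whose state is dropped after an idle period, under these assumptions: the threshold of 10 comes from the example, while the names `run_query`, `retention`, the timestamps, and the lazy cleanup on access are all hypothetical simplifications (a real engine would clean up with timers).

```python
from typing import Dict, List, Optional, Tuple

THRESHOLD = 10  # emit an article once its praise count reaches 10


def run_query(events: List[Tuple[int, int]], retention: Optional[int]) -> List[int]:
    """Simulate a per-article count with optional idle state retention.

    events: (timestamp, article_id) pairs in time order.
    retention: idle time after which a key's state is dropped; None keeps state forever.
    Returns the article_ids written to the (append-only) external store.
    """
    counts: Dict[int, int] = {}     # article_id -> current count
    last_seen: Dict[int, int] = {}  # article_id -> timestamp of last update
    sink: List[int] = []            # records emitted to the external store

    for ts, article_id in events:
        # Drop the state if the key has been idle longer than the retention time.
        if (
            retention is not None
            and article_id in counts
            and ts - last_seen[article_id] > retention
        ):
            del counts[article_id]
            del last_seen[article_id]

        counts[article_id] = counts.get(article_id, 0) + 1
        last_seen[article_id] = ts

        # Emit exactly when the count hits the threshold.
        if counts[article_id] == THRESHOLD:
            sink.append(article_id)

    return sink


# Ten praises for article 123, a long idle gap, then ten more praises.
events = [(t, 123) for t in range(10)] + [(t, 123) for t in range(1000, 1010)]

with_retention = run_query(events, retention=100)
without_retention = run_query(events, retention=None)

print(with_retention)     # [123, 123]: the count restarted and hit 10 a second time
print(without_retention)  # [123]: the count went from 10 to 20, no second emission
```

With the state kept forever, article 123 is emitted once. With a retention time shorter than the idle gap, the count restarts from zero, so the same article is emitted twice, and the first record can no longer be retracted.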