I don't know enough about the code-level implementation to comment on the validity of the fix. My main issue is that we use a lot of TTL columns, and in many cases every column has a TTL shorter than gc_grace. The problem arises when columns become gc-able and are compacted away on one node but not on all replicas: the periodic repair process then ends up copying all the garbage columns and rows back to the other replicas. This consumes a lot of repair resources and makes rows stick around much longer than they should, which consumes even more cluster resources.
You can set gc_grace to 0 if you never manually delete any of these columns. You only need tombstones if you do manual deletes.

Otherwise, the two tickets should improve your situation.
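
For anyone following along, here is a rough Python sketch of the repair problem described in the first comment. It is a toy model, not Cassandra's code: `Column`, `compact`, and `naive_repair` are hypothetical names, and the rule that a column is gc-able once `written_at + ttl + gc_grace` has passed is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a TTL column; times are in seconds."""
    name: str
    written_at: int
    ttl: int

    def gcable_at(self, gc_grace: int) -> int:
        # Assumed toy rule: purgeable once the TTL has expired
        # and a further gc_grace seconds have elapsed.
        return self.written_at + self.ttl + gc_grace


def compact(replica: dict, now: int, gc_grace: int) -> dict:
    """Return a copy of the replica with gc-able columns dropped."""
    return {n: c for n, c in replica.items() if now < c.gcable_at(gc_grace)}


def naive_repair(a: dict, b: dict) -> tuple:
    """Copy every column missing on either side, ignoring gc-ability."""
    merged = {**a, **b}
    return dict(merged), dict(merged)


GC_GRACE = 864000  # 10 days, chosen only for the example
col = Column("session", written_at=0, ttl=3600)

node_a = {"session": col}
node_b = {"session": col}

# Move past the point where the column is gc-able, then compact node A only.
now = col.gcable_at(GC_GRACE) + 1
node_a = compact(node_a, now, GC_GRACE)
purged_on_a = "session" not in node_a

# Repair sees a difference and streams the garbage column back to node A.
node_a, node_b = naive_repair(node_a, node_b)
resurrected_on_a = "session" in node_a

print(purged_on_a, resurrected_on_a)
```

In this model, node A purges the column and then receives it again from node B, because the repair step compares data without checking whether it is already gc-able.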