I removed 4 nodes with "nodetool decommission". 2 of them left without any issue, while the other 2 remained in the "leaving" state even after they had finished streaming their data.
The only thing specific to these 2 nodes is that they had a lot of hints pending. These were hints for a node that could not come back up and that I had removed earlier. (When that node tried to come back, Hinted Handoff put it under such heavy load that it caused a lot of latency in our app; since it had still not managed to come back after 10 minutes, I removed it.)
So I faced 3 bugs (or problems):
1 - At first, one of my nodes went down for 5 min, and when it came back it got flooded by Hinted Handoff so hard that it could not handle the real-time queries properly. I haven't found a way to prioritize app queries over Hinted Handoff.
2 - Nodes keep hints for a node that has been removed.
3 - Nodes with 500MB to 3GB of hints stored for a removed node can't be decommissioned; they get stuck after streaming their data.
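For anyone who wants to check for problem 3, here is a minimal sketch in Python of how nodes stuck in the "leaving" state could be spotted. It assumes a Cassandra version whose nodetool has the "status" command and prints the usual two-letter code (Up/Down, then Normal/Leaving/Joining/Moving) at the start of each node row. The function name and the sample text are made up for illustration; the sample is not output from my cluster.

```python
import re

# Matches node rows in `nodetool status` output: a two-letter code
# (Up/Down followed by Normal/Leaving/Joining/Moving), then the address.
NODE_ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)")


def leaving_nodes(status_output):
    """Return the addresses of nodes whose state is Leaving (UL or DL)."""
    addresses = []
    for line in status_output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match and match.group(2) == "L":
            addresses.append(match.group(3))
    return addresses


# Illustrative sample only (assumed layout, invented addresses and IDs).
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load    Tokens  Owns   Host ID                               Rack
UN  10.0.0.1  1.2 GB  256     25.0%  11111111-1111-1111-1111-111111111111  rack1
UL  10.0.0.2  3.0 GB  256     25.0%  22222222-2222-2222-2222-222222222222  rack1
"""

print(leaving_nodes(SAMPLE))
```

A node that still shows up as leaving long after streaming has finished would be a candidate for the workaround in solution 3.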
As solutions to these 3 issues, I did the following:
Solution to 1 - I removed the down node (nodetool removenode).
Solution to 2 - Stop the node and remove the system hints.
Solution to 3 - Stop the node and use removenode instead of decommission.
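The workarounds above can be sketched as a small Python wrapper around nodetool. This is only an illustration under assumptions: the helper names are hypothetical, the wrapper defaults to a dry run that just returns the command line, and stopping the Cassandra process itself is left as a manual step because it depends on how the service is installed. The "truncatehints" helper is not the procedure described above (which was to stop the node and remove the system hints by hand); it is a possible alternative, assuming the nodetool in use provides that command.

```python
import subprocess


def nodetool(*args, host=None, dry_run=True):
    """Build a nodetool command line and, unless dry_run is set, run it."""
    cmd = ["nodetool"]
    if host is not None:
        cmd += ["-h", host]  # address of the node to connect to
    cmd += list(args)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


def remove_dead_node(host_id, dry_run=True):
    # Solutions 1 and 3: run from a live node, after the target node is
    # down (stopped manually). host_id is the ID that nodetool reports
    # for the node being removed.
    return nodetool("removenode", host_id, dry_run=dry_run)


def truncate_hints_for(endpoint, host=None, dry_run=True):
    # Possible alternative for solution 2 (an assumption, not what was
    # done above): drop the hints stored for a given endpoint.
    return nodetool("truncatehints", endpoint, host=host, dry_run=dry_run)


# Dry run: only shows the command that would be executed.
print(remove_dead_node("<host-id-of-removed-node>"))
```

Keeping dry_run=True as the default makes it possible to review the exact command before anything touches the ring.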
Now I have no more issues, yet I felt I had to report this. Maybe my experience can help users get out of tricky situations and help committers detect some issues, especially around hinted handoff.