It looks like you are assigning tasks to different slot sharing groups, which forces them not to share the same slot.
So you will need at least 2 slots for the streaming job to start running successfully.
Killing one of the 2 TMs (each providing one slot) leaves insufficient slots, so your job will hang at slot allocation.
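For reference, here is a minimal sketch of how operators end up in different slot sharing groups (the group names and job are illustrative; this needs a Flink runtime with enough slots to actually run):

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Operators in different slot sharing groups cannot be placed in
        // the same slot, so this pipeline needs at least 2 slots.
        env.fromElements(1, 2, 3)
           .map(x -> x * 2).returns(Types.INT).slotSharingGroup("group-a")
           .map(x -> x + 1).returns(Types.INT).slotSharingGroup("group-b")
           .print();

        env.execute("slot-sharing-sketch");
    }
}
```

If you instead leave everything in the default slot sharing group, the whole pipeline can run in a single slot and killing one TM would not block scheduling (as long as one slot remains).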
Task state is needed so that unprocessed source data is not skipped, which avoids data loss. It is also needed if you want the failed task to recover to the state it was in right before the failure.
Checkpointing is needed to persist the task state. If it is not enabled, the job will restart with the initial state, i.e. it will consume data from the very beginning, which can mean a large reprocessing backlog.
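Enabling checkpointing is a one-liner on the execution environment. A minimal sketch (the 60-second interval is just an example value; tune it for your job):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointingSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 s with exactly-once guarantees,
        // so task state survives failures and restarts resume from the
        // last completed checkpoint instead of from the very beginning.
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);
    }
}
```

With this enabled, a restarted task restores its state from the latest completed checkpoint and the sources rewind only to the corresponding offsets, rather than replaying everything.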