I am trying to better understand shuffle in Spark.
Based on my understanding so far:

Shuffle Write: writes the output of an intermediate stage to local disk if memory is not sufficient.
Example: if each worker has 200 MB of memory for intermediate results and the results are 300 MB, then each executor will keep 200 MB in memory and write the remaining 100 MB to local disk.
Shuffle Read: each executor will read from the other executors' memory + disk, so the total read in the above case will be 300 MB (200 from memory and 100 from disk) * number of executors?
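To make sure I'm picturing the mechanics right, here is a toy sketch in plain Python (not Spark, and not how Spark is actually implemented) of what I understand the two sides of a shuffle to do: each "mapper" partitions its records by key into one bucket per "reducer" (the write side), and each reducer then collects its bucket from every mapper's output (the read side). The partitioner, names, and data are purely illustrative assumptions on my part.

```python
NUM_REDUCERS = 3

def shuffle_write(mapper_records):
    """Partition one mapper's (key, value) records into per-reducer buckets.

    Uses a deterministic toy partitioner (ord of the key's first character)
    purely for illustration; this is an assumption, not Spark's partitioner.
    """
    buckets = {r: [] for r in range(NUM_REDUCERS)}
    for key, value in mapper_records:
        buckets[ord(key[0]) % NUM_REDUCERS].append((key, value))
    return buckets

def shuffle_read(all_mapper_buckets, reducer_id):
    """One reducer fetches its bucket from every mapper's written output."""
    return [rec
            for buckets in all_mapper_buckets
            for rec in buckets[reducer_id]]

# Two "mappers", each holding some (key, value) records.
mappers = [
    [("a", 1), ("b", 2), ("c", 3)],
    [("a", 4), ("b", 5), ("d", 6)],
]

# Write side: every mapper partitions its own output.
written = [shuffle_write(m) for m in mappers]

# Read side: every reducer pulls its partition from every mapper.
for r in range(NUM_REDUCERS):
    print(r, sorted(shuffle_read(written, r)))
```

The point of the sketch for my question: every reducer reads a slice from every mapper, which is where the "read from other executors" cost comes from.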
Is my understanding correct ?