Hi,

I am trying to better understand shuffle in Spark.

Based on my understanding thus far:

Shuffle Write: writes the output of an intermediate stage to local disk if memory is not sufficient.
Example: if each worker has 200 MB of memory for intermediate results and the results are 300 MB, then each executor will keep 200 MB in memory and write the remaining 100 MB to local disk.

Shuffle Read: each executor will read from the other executors' memory + disk, so the total read in the above case will be 300 MB (200 MB from memory and 100 MB from disk) * number of executors?
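
To make the arithmetic concrete, here is a small sketch of the model I have in mind. It only restates my assumption and may not match what Spark actually does. The function names and the executor count of 4 are made up for illustration and are not part of any Spark API.

```python
def shuffle_write(output_mb, memory_mb):
    """Split one executor's stage output under my assumed model:
    fill memory first, write whatever does not fit to local disk."""
    in_memory = min(output_mb, memory_mb)
    on_disk = max(0, output_mb - memory_mb)
    return in_memory, on_disk


def total_shuffle_read(output_mb, memory_mb, num_executors):
    """Total shuffle read under my assumed model: every executor's
    full output (memory part + disk part) is read once."""
    in_memory, on_disk = shuffle_write(output_mb, memory_mb)
    return (in_memory + on_disk) * num_executors


# Numbers from the example above; 4 executors is a hypothetical value.
in_memory, on_disk = shuffle_write(300, 200)
print(in_memory, on_disk)                # 200 100
print(total_shuffle_read(300, 200, 4))   # 1200
```

So with 4 executors I would expect a total shuffle read of 1200 MB, if the model above is right.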

Is my understanding correct?

Thanks,
Kartik