hadoop-hdfs-user mailing list archives

From Jan Lukavský <jan.lukav...@firma.seznam.cz>
Subject ProcFsBasedProcessTree and clean pages in smaps
Date Thu, 04 Feb 2016 13:11:34 GMT

I have a question about the way LinuxResourceCalculatorPlugin calculates the 
memory consumed by a process tree (the calculation is done by the 
ProcfsBasedProcessTree class). When we enable disk caching in Apache Spark 
jobs running on a YARN cluster, the NodeManager starts killing the containers 
while they read the cached data, with the error "Container is 
running beyond memory limits ...". The reason is that even with parsing of 
the smaps file enabled, ProcfsBasedProcessTree counts mmapped read-only pages 
as consumed by the process tree, and Spark uses FileChannel.map(MapMode.READ_ONLY) 
to read the cached data. The JVM therefore appears to consume *a lot* more 
memory than the configured heap size (and this cannot really be controlled), 
but in my opinion this memory is not truly consumed by the process: the 
kernel can reclaim these pages if needed. My question is: is there an 
explicit reason why "Private_Clean" pages are counted as consumed by the 
process tree? I patched ProcfsBasedProcessTree not to count them, but I 
don't know whether this is the "correct" solution.
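To make the scenario concrete, here is a minimal standalone sketch of the mapping pattern in question (the file name, size, and class name are made up for illustration; this is not Spark's actual code). A READ_ONLY mapping is file-backed, so once the pages are faulted in they appear in /proc/<pid>/smaps as clean pages the kernel can drop under memory pressure, yet they are still counted toward the process tree's memory by ProcfsBasedProcessTree:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadOnlyDemo {

    // Map a file read-only and touch every page so it is faulted in.
    // Returns the number of bytes mapped.
    static long mapAndTouch(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // READ_ONLY mapping: pages are file-backed and clean; the kernel
            // can reclaim them at any time, but they still show up in the
            // process's smaps and are counted against the YARN container.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            for (int i = 0; i < buf.limit(); i += 4096) {
                sum += buf.get(i); // read one byte per page to fault it in
            }
            return buf.limit() + (sum & 0); // sum kept so the loop isn't optimized away
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file standing in for a Spark disk-cache block (hypothetical).
        Path cacheFile = Files.createTempFile("spark-block-", ".data");
        Files.write(cacheFile, new byte[4 * 1024 * 1024]); // 4 MiB of zeros
        long mapped = mapAndTouch(cacheFile);
        System.out.println("mapped " + mapped + " bytes read-only");
        Files.deleteIfExists(cacheFile);
    }
}
```

Running this and watching `grep Private_Clean /proc/<pid>/smaps` on Linux shows the mapped region accumulating clean pages, which is exactly what ProcfsBasedProcessTree charges to the container.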

Thanks for any opinions,

To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
For additional commands, e-mail: user-help@hadoop.apache.org
