I would recommend this: http://www.aosabook.org/en/hdfs.html


I want to learn more about when we should change (increase or decrease) the replication factor for better performance. I would also like to understand the internals of how the replication factor works, and the pros and cons of larger versus smaller replication factors. For example, when deploying a static model or config file for Hadoop jobs, is a larger replication factor better? Unfortunately, I could not find related material by searching. I would appreciate it if anyone could point me to some good documents.

