hadoop-common-user mailing list archives

From "Natarajan, Prabakaran 1. (NSN - IN/Bangalore)" <prabakaran.1.natara...@nsn.com>
Subject Multiple Part files
Date Thu, 17 Jul 2014 07:52:31 GMT

After a MapReduce job, we see multiple small part files in the output directory. We
are using the RCFile format (Snappy codec).

1)      Does each part file take up a full 64 MB block?
2)      How can we merge these multiple RCFile part files into one RCFile?
3)      What are the pros and cons of having multiple part files?
4)      Will merging the part files improve performance?

Thanks and Regards
Prabakaran.N  aka NP
nsn, Bangalore
When "I" is replaced by "We" - even Illness becomes "Wellness"
