hadoop-common-user mailing list archives

From Piyush Mukati <piyush.muk...@gmail.com>
Subject merging small files in HDFS
Date Thu, 03 Nov 2016 12:53:46 GMT
I want to merge multiple files in one HDFS directory into a single file. I am
planning to write a map-only job using an input format that creates only one
InputSplit per directory.
This way my job doesn't need to do any shuffle/sort (it only reads and writes
back to disk).
Is there any such input format already implemented?
Or is there a better solution for this problem?
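
To make the "only read and write back" idea concrete, here is a minimal sketch of plain sequential concatenation with no sort or shuffle step. It rests on several assumptions: it uses java.nio on a local directory as a stand-in for HDFS, the names DirMerger and merge are hypothetical, and files are appended in name order. On HDFS the same loop would presumably go through org.apache.hadoop.fs.FileSystem (listStatus, open, create) rather than java.nio.file.Files.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: a local-filesystem stand-in
// for "read every file in one directory and write it back as one file".
final class DirMerger {

    /**
     * Appends every regular file directly under dir to out, in name order.
     * Returns the total number of bytes written.
     */
    static long merge(Path dir, Path out) throws IOException {
        Path target = out.toAbsolutePath().normalize();

        // Collect the inputs first so the directory handle is closed
        // before any copying starts.
        List<Path> parts = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile)
                   // Skip the output file if it happens to live in dir.
                   .filter(p -> !p.toAbsolutePath().normalize().equals(target))
                   .forEach(parts::add);
        }
        Collections.sort(parts); // deterministic order; Path is Comparable

        long total = 0;
        try (OutputStream sink = Files.newOutputStream(out)) {
            for (Path part : parts) {
                // Streams the file's bytes into sink; no sorting involved.
                total += Files.copy(part, sink);
            }
        }
        return total;
    }
}
```

Calling DirMerger.merge(inputDir, outputFile) on a directory holding part-0 and part-1 writes the bytes of part-0 followed by those of part-1. The copy runs in a single thread, which is the same trade-off as having one InputSplit per directory: no shuffle, but no parallelism within a directory either.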

