hadoop-common-user mailing list archives

From Alexander Alten-Lorenz <wget.n...@gmail.com>
Subject Re: recombining split files after data is processed
Date Mon, 23 Feb 2015 07:05:09 GMT

You can use a single reducer (http://wiki.apache.org/hadoop/HowManyMapsAndReduces),
which suits smaller datasets because the job then writes one output file, or merge
the parts afterwards with 'getmerge', which concatenates the files under an HDFS
path into one local file:

hadoop dfs -getmerge /hdfs/path local_file_name
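
If the part files have already been copied to local disk (for example with `hadoop fs -get`), a similar merge can be approximated without Hadoop. This is only a minimal sketch, assuming the outputs use the usual part-* naming and that lexical name order is the order you want; `merge_parts` is a hypothetical helper for illustration, not a Hadoop command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): concatenate job output parts
# that already sit in a local directory, in lexical file-name order.
merge_parts() {
    src_dir=$1     # local directory holding the part-* files
    out_file=$2    # merged file to create (truncated if it exists)

    # Start from an empty output file; bail out if it cannot be written.
    : > "$out_file" || return 1

    # The shell expands the glob in sorted order, so part-00000 comes
    # before part-00001, and so on. Marker files such as _SUCCESS do not
    # match the pattern and are left out.
    for part in "$src_dir"/part-*; do
        # If nothing matched, the pattern stays unexpanded; skip it.
        [ -f "$part" ] || continue
        cat "$part" >> "$out_file"
    done
}

# Example (paths are placeholders):
#   merge_parts ./job_output merged.txt
```

Write the merged file outside the source directory, or at least under a name that does not start with part-, so it is not picked up by the glob on a later run.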


> On 23 Feb 2015, at 08:00, Jonathan Aquilina <jaquilina@eagleeyet.net> wrote:
> Hey all,
> I understand that the purpose of splitting files is to distribute the data to multiple
> core and task nodes in a cluster. My question is: once the output is complete, is there
> a way to combine all the parts into a single file?
> -- 
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
