hadoop-mapreduce-user mailing list archives

From Giovanni Mascari <giovanni.masc...@polito.it>
Subject Re: merging small files in HDFS
Date Thu, 03 Nov 2016 13:53:43 GMT
Hi,
if I understand your request correctly, you only need to merge some data
resulting from an HDFS write operation.
In that case, I suppose your best option is to use Hadoop Streaming
with the 'cat' command.

take a look here:
https://hadoop.apache.org/docs/r1.2.1/streaming.html
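A minimal sketch of that approach (paths, jar location, and directory names are examples, not part of the original mail; adjust for your install). Forcing a single reducer makes all input records land in one output file:

```shell
# Merge all files under /data/in into a single HDFS output file by
# running 'cat' as both mapper and reducer with exactly one reduce task.
# NOTE: streaming splits each line at the first tab into key/value, so
# records may be reordered by the shuffle; for a byte-exact concatenation
# the getmerge command below is simpler.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -D mapreduce.job.reduces=1 \
  -input /data/in \
  -output /data/merged \
  -mapper cat \
  -reducer cat

# Alternative: pull the directory down as one local file (no MapReduce job).
hdfs dfs -getmerge /data/in /tmp/merged.txt
```

`getmerge` concatenates the files in order on the local filesystem, which is often enough when the merged result does not need to stay in HDFS.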

Regards

On 03/11/2016 13:53, Piyush Mukati wrote:
> Hi,
> I want to merge multiple files in one HDFS dir into one file. I am 
> planning to write a map-only job using an input format which will create 
> only one InputSplit per dir.
> This way my job doesn't need to do any shuffle/sort (only read and write 
> back to disk).
> Is there any such file format already implemented?
> Or is there any better solution for the problem?
>
> thanks.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
For additional commands, e-mail: user-help@hadoop.apache.org

