hadoop-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: Merging files
Date Sat, 22 Dec 2012 19:24:24 GMT
The technical term for this is "copying".  You may have heard of it.

It is a subject of such long technical standing that many do not consider
it worthy of detailed documentation.

Distcp effects a similar process and can be modified to combine the input
files into a single file.

http://hadoop.apache.org/docs/r1.0.4/distcp.html
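
For the "open a target stream and write all source streams into it" approach quoted below, here is a minimal sketch in Java. It is not taken from this thread: the class name StreamMerger and the method mergeInto are hypothetical, and the code uses only plain java.io streams so that it runs without a cluster. On HDFS one would presumably obtain the target from FileSystem.create(Path) and each source from FileSystem.open(Path); that wiring is an assumption and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: appends several source streams to one target stream.
final class StreamMerger {

    // Copies every source, in list order, into the target and returns the
    // total number of bytes written. Each source is closed once it has been
    // fully read; the target is flushed but left open for the caller to close.
    static long mergeInto(OutputStream target, List<? extends InputStream> sources)
            throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        for (InputStream source : sources) {
            try (InputStream in = source) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    target.write(buffer, 0, n);
                    total += n;
                }
            }
        }
        target.flush();
        return total;
    }

    // Small in-memory demonstration; real sources would be file streams.
    public static void main(String[] args) throws IOException {
        List<InputStream> sources = Arrays.asList(
                new ByteArrayInputStream("first\n".getBytes(StandardCharsets.UTF_8)),
                new ByteArrayInputStream("second\n".getBytes(StandardCharsets.UTF_8)));
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long written = mergeInto(target, sources);
        System.out.print(new String(target.toByteArray(), StandardCharsets.UTF_8));
        System.out.println(written + " bytes merged");
    }
}
```

Because the sources are written strictly in list order, this route keeps control over ordering, which the single-reducer identity job mentioned below may not.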


On Sat, Dec 22, 2012 at 10:54 AM, Barak Yaish <barak.yaish@gmail.com> wrote:

> Can you please attach HOW-TO links for the alternatives you mentioned?
>
>
> On Sat, Dec 22, 2012 at 10:46 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> Yes, via the simple act of opening a target stream and writing all
>> source streams into it. Or to save code time, an identity job with a
>> single reducer (you may not get control over ordering this way).
>>
>> On Sat, Dec 22, 2012 at 12:10 PM, Mohit Anchlia <mohitanchlia@gmail.com>
>> wrote:
>> > Is it possible to merge files from different HDFS locations into one
>> > file in a single HDFS location?
>>
>>
>>
>> --
>> Harsh J
>>
>
>
