hadoop-common-user mailing list archives

From madhu phatak <phatak....@gmail.com>
Subject Re: HDFS File Appending
Date Tue, 21 Jun 2011 10:11:30 GMT
I think HDFS does not support appending. I'm not sure about Pig, but if you
are using Hadoop directly you can zip the files and use the zip as the input
to the jobs.
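
A minimal sketch of the zipping step suggested above, using only the standard
java.util.zip classes on the local machine before the archive is pushed to
HDFS. The class name SmallFileZipper and the method zipDirectory are
hypothetical names chosen for illustration, not part of Hadoop. How the job
then reads the zip is not covered here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: packs many small local files into one zip archive.
public class SmallFileZipper {

    // Adds every regular file directly under srcDir to zipFile and returns
    // how many files were packed. Subdirectories are skipped.
    // Assumption: zipFile lives outside srcDir, so the archive never
    // tries to include itself.
    public static int zipDirectory(Path srcDir, Path zipFile) throws IOException {
        int count = 0;
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out);
             DirectoryStream<Path> files = Files.newDirectoryStream(srcDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                // One zip entry per small file, named after the file.
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);
                zip.closeEntry();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: SmallFileZipper <srcDir> <zipFile>");
            return;
        }
        int n = zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Packed " + n + " files into " + args[1]);
    }
}
```

The resulting archive is a single large file, so it can be copied into HDFS
in one step instead of pushing each small file separately.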

On Fri, Jun 17, 2011 at 6:56 AM, Xiaobo Gu <guxiaobo1982@gmail.com> wrote:

> Please refer to FileUtil.copyMerge.
>
> On Fri, Jun 17, 2011 at 8:33 AM, jagaran das <jagaran_das@yahoo.co.in>
> wrote:
> > Hi,
> >
> > We have a requirement where a huge number of small files are pushed to
> > HDFS and then analyzed with Pig.
> > To get around the classic "Small File Issue", we merge the files and push
> > a bigger file into HDFS.
> > But we are losing time in this merging step of our pipeline.
> >
> > If we could append directly to an existing file in HDFS, we could save
> > this "Merging Files" time.
> >
> > Can you please suggest whether there is a newer stable version of Hadoop
> > that we can use for appending?
> >
> > Thanks and Regards,
> > Jagaran
>
