hadoop-common-user mailing list archives

From Rita <rmorgan...@gmail.com>
Subject Re: tar or hadoop archive
Date Mon, 27 Jun 2011 23:36:15 GMT
So, it builds an index of the files?



On Mon, Jun 27, 2011 at 10:10 AM, Joey Echeverria <joey@cloudera.com> wrote:

> The advantage of a Hadoop archive file is that it lets you access the
> files stored in it directly. For example, if you archived three files
> (a.txt, b.txt, c.txt) in an archive called foo.har, you could cat one
> of the three files using the hadoop command line:
>
> hadoop fs -cat har:///user/joey/out/foo.har/a.txt
>
> You can also copy files out of the archive or use files in the archive
> as input to MapReduce jobs.
>
> -Joey
>
> On Mon, Jun 27, 2011 at 3:06 AM, Rita <rmorgan466@gmail.com> wrote:
> > We use hadoop/hdfs to archive data. I archive a lot of files by creating
> > one large tar file and then placing it in HDFS. Is it better to use a
> > hadoop archive for this, or is it essentially the same thing?
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>



-- 
--- Get your facts first, then you can distort them as you please.--
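
For reference, a minimal sketch of the workflow described in the quoted reply: create foo.har, list it, and cat one member. A Hadoop archive is a directory holding index files (_index and _masterindex) alongside the packed data (part-* files), which is what allows a single member to be read without unpacking the rest. The input directory /user/joey/in and the helper har_uri are assumptions made for illustration; only the output path and the file names come from the thread. The exact `hadoop archive` options can vary by version, so check them against your release.

```shell
#!/bin/sh
# Assumed locations: only /user/joey/out/foo.har appears in the thread;
# /user/joey/in is a hypothetical input directory.
PARENT=/user/joey/in
DEST=/user/joey/out
ARCHIVE=foo.har

# Hypothetical helper: build the har:// URI for one file inside an archive.
# $1 = HDFS directory holding the archive, $2 = archive name, $3 = member file
har_uri() {
    printf 'har://%s/%s/%s\n' "$1" "$2" "$3"
}

if command -v hadoop >/dev/null 2>&1; then
    # Pack a.txt, b.txt and c.txt (paths relative to $PARENT) into $DEST/foo.har.
    hadoop archive -archiveName "$ARCHIVE" -p "$PARENT" a.txt b.txt c.txt "$DEST"

    # List the archive's contents through the har filesystem.
    hadoop fs -ls "har://$DEST/$ARCHIVE"

    # Read a single member directly, as in the quoted example.
    hadoop fs -cat "$(har_uri "$DEST" "$ARCHIVE" a.txt)"
else
    echo "hadoop not found on PATH; skipping archive commands" >&2
fi
```

With a tar file stored in HDFS, reading a.txt would instead mean copying the whole tar out and extracting it locally.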
