hadoop-common-user mailing list archives

From "zhu weimin" <xim-...@tsm.kddilabs.jp>
Subject RE: libhdfs / gzip support
Date Tue, 20 Jul 2010 00:38:33 GMT
Hi

libhdfs does not support storing files compressed.
But you can create a patch for it using the GZIPOutputStream class.
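
Roughly, such a patch would wrap the FSDataOutputStream that libhdfs writes
through in a java.util.zip.GZIPOutputStream on the Java side. A minimal
sketch of that idea in plain Java (the output path below is only a
placeholder, and this is client-side user code rather than an actual
libhdfs patch):

import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GzipHdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Placeholder path; a libhdfs patch would do this wrapping inside
        // hdfsOpenFile()/hdfsWrite() instead of in user code.
        Path path = new Path("/tmp/example.gz");

        FSDataOutputStream raw = fs.create(path, true);
        OutputStream out = new GZIPOutputStream(raw);
        try {
            // Everything written through "out" is stored gzip-compressed.
            out.write("some data to store compressed\n".getBytes("UTF-8"));
        } finally {
            out.close(); // finishes the gzip stream and closes the HDFS file
        }
        fs.close();
    }
}

Note that gzip is not splittable, so a file compressed this way is read by
a single map task; the block-level compression described below would need a
container format or a splittable codec instead.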

weimin zhu

> -----Original Message-----
> From: Leon Mergen [mailto:leon@solatis.com]
> Sent: Monday, July 19, 2010 9:57 PM
> To: common-user@hadoop.apache.org
> Subject: libhdfs / gzip support
> 
> Hello,
> 
> We're using Hadoop in a C-oriented architecture ourselves, with libhdfs
> for storing files and Hadoop.Pipes for map/reduce jobs. Since the data
> we're storing benefits a lot from compression, we're currently
> investigating ways to do this.
> 
> Ideally we would perform block-level compression: each separate 64MB
> block of data would be compressed. Hadoop.Pipes seems to provide a way
> to change the InputReader and OutputReader to enable the GzipCodec;
> however, I did not find a good way to tell libhdfs to store files
> compressed.
> 
> Does anyone have experience with this, and/or ideas on how best to
> approach this problem?
> 
> We're using Hadoop 0.20.2.
> 
> Regards,
> 
> Leon Mergen
