hadoop-common-user mailing list archives

From Leon Mergen <l...@solatis.com>
Subject libhdfs / gzip support
Date Mon, 19 Jul 2010 12:56:51 GMT

We're using Hadoop in a C-oriented architecture, with libhdfs for storing
files and Hadoop.Pipes for map/reduce jobs. Since the data we're storing
benefits a lot from compression, we're currently investigating ways to store
it compressed.

Ideally we would perform block-level compression: each separate 64MB block
of data would be compressed independently. Hadoop.Pipes seems to provide a
way to change the InputReader and OutputReader to enable the GzipCodec;
however, I did not find a good way to tell libhdfs to store files compressed.
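
To make the block-level idea concrete, here is a minimal C sketch of one
possible client-side approach: split the data into fixed-size blocks, encode
each block independently, and write each one with a length prefix. Every name
in it (write_blocks, block_codec_fn, block_sink_fn, store_codec, mem_sink) is
hypothetical and not part of libhdfs or Hadoop.Pipes. The codec is a
placeholder that copies bytes unchanged, and the sink writes to memory; it is
an assumption that a real version would plug in a gzip library as the codec
and forward the sink's bytes to hdfsWrite().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; none belong to libhdfs or Hadoop.Pipes. */

/* Encodes one block into out; returns the encoded size, or -1 on error. */
typedef int64_t (*block_codec_fn)(const unsigned char *in, size_t in_len,
                                  unsigned char *out, size_t out_cap);

/* Appends len bytes to the destination; returns 0 on success. */
typedef int (*block_sink_fn)(const void *data, size_t len, void *ctx);

/* Placeholder codec: copies the block unchanged. A real codec would
 * compress here, for example with a gzip library. */
static int64_t store_codec(const unsigned char *in, size_t in_len,
                           unsigned char *out, size_t out_cap)
{
    if (in_len > out_cap)
        return -1;
    memcpy(out, in, in_len);
    return (int64_t)in_len;
}

/* In-memory sink for demonstration. With libhdfs, a sink would instead
 * forward the bytes to hdfsWrite() on an open hdfsFile (assumption). */
struct mem_sink {
    unsigned char *buf;
    size_t cap;
    size_t len;
};

static int mem_sink_write(const void *data, size_t len, void *ctx)
{
    struct mem_sink *s = (struct mem_sink *)ctx;
    if (len > s->cap - s->len)
        return -1;
    memcpy(s->buf + s->len, data, len);
    s->len += len;
    return 0;
}

/* Splits data into blocks of at most block_size bytes, encodes each block
 * independently, and emits a 4-byte big-endian length followed by the
 * encoded bytes. Returns the number of blocks written, or -1 on error. */
static int64_t write_blocks(const unsigned char *data, size_t len,
                            size_t block_size, block_codec_fn codec,
                            unsigned char *scratch, size_t scratch_cap,
                            block_sink_fn sink, void *ctx)
{
    int64_t blocks = 0;
    size_t off = 0;

    assert(codec != NULL && sink != NULL);
    if (block_size == 0)
        return -1;

    while (off < len) {
        size_t n = (len - off < block_size) ? (len - off) : block_size;
        int64_t enc = codec(data + off, n, scratch, scratch_cap);
        unsigned char hdr[4];

        if (enc < 0 || enc > (int64_t)UINT32_MAX)
            return -1;
        hdr[0] = (unsigned char)((enc >> 24) & 0xFF);
        hdr[1] = (unsigned char)((enc >> 16) & 0xFF);
        hdr[2] = (unsigned char)((enc >> 8) & 0xFF);
        hdr[3] = (unsigned char)(enc & 0xFF);

        if (sink(hdr, sizeof hdr, ctx) != 0)
            return -1;
        if (sink(scratch, (size_t)enc, ctx) != 0)
            return -1;

        off += n;
        blocks++;
    }
    return blocks;
}
```

For example, calling write_blocks on 10 bytes with a block_size of 4 and the
placeholder codec produces 3 blocks and 22 bytes of output (three 4-byte
length prefixes plus the 10 data bytes). In practice block_size would sit
near the 64MB HDFS block size, but the encoded frames will generally not line
up with HDFS block boundaries, so a reader needs the length prefixes to find
where each frame starts.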

Does anyone have any experience with this, and/or ideas on how best to
approach it?

We're using Hadoop 0.20.2.


Leon Mergen
