hadoop-common-user mailing list archives

From "Arv Mistry" <...@kindsight.net>
Subject Writing compressed data to HDFS
Date Tue, 01 Jun 2010 14:28:01 GMT
Hi,

I have a Java process that writes compressed data to HDFS. I do this by
wrapping the FSDataOutputStream in a GZIPOutputStream and calling its
write() method, i.e. something like:

FSDataOutputStream out = fs.create(file);
gzip = new GZIPOutputStream(out);
gzip.write("sss".getBytes("UTF8"));

The file seems to get written OK.
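
For anyone who wants to reproduce the write path without a cluster, here is a minimal sketch. It is not my actual code: the class name GzipRoundTrip and its helper methods are made up for illustration, and a ByteArrayOutputStream stands in for the FSDataOutputStream returned by fs.create(file). It only checks that the bytes produced by GZIPOutputStream start with the gzip magic number and decompress back to the original text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class GzipRoundTrip {

    // Compress text to gzip bytes. The ByteArrayOutputStream is an assumed
    // stand-in for the FSDataOutputStream returned by fs.create(file).
    static byte[] compress(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the stream, which writes the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Decompress gzip bytes back into a UTF-8 string.
    static String decompress(byte[] gz) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // True if the data starts with the gzip magic bytes 0x1f 0x8b.
    static boolean hasGzipMagic(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        byte[] gz = compress("sss");
        System.out.println("gzip magic present: " + hasGzipMagic(gz));
        System.out.println("round trip: " + decompress(gz));
    }
}
```

The sketch closes the GZIPOutputStream before reading the bytes back, because the gzip trailer is only written when the stream is finished or closed.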

However, when I copy the file out of HDFS and try to unzip it, gunzip
complains:

gunzip: cs_1_20100601_120000_1275396891183.cgz: unknown suffix -- ignored

When I run 'file' on it, it is recognized as 'gzip compressed data, from
FAT filesystem (MS-DOS, OS/2, NT)'.

Any ideas? Appreciate any help.

Cheers Arv
