hadoop-common-user mailing list archives

From "Jim R. Wilson" <wilson.ji...@gmail.com>
Subject Re: [core-user] Help deflating output files
Date Thu, 05 Jun 2008 00:39:23 GMT
Has someone already written a generic program for inflating these
files?  It would be a great util to add to the core :)

-- Jim

On Wed, Jun 4, 2008 at 7:27 PM, Runping Qi <runping@yahoo-inc.com> wrote:
>
> You can run another map-only job to read the deflated files, convert
> them, and write them out in the format you want.
>
> Runping
>
>
>> -----Original Message-----
>> From: Jim R. Wilson [mailto:wilson.jim.r@gmail.com]
>> Sent: Wednesday, June 04, 2008 4:13 PM
>> To: core-user@hadoop.apache.org
>> Subject: [core-user] Help deflating output files
>>
>> Hi all,
>>
>> I'm using hadoop-streaming to execute Python jobs in an EC2 cluster.
>> The output directory in HDFS has part-00000.deflate files - how can I
>> inflate them back into regular text?
>>
>> In my hadoop-site.xml, I unfortunately have:
>> <property>
>>   <name>mapred.output.compress</name>
>>   <value>true</value>
>> </property>
>> <property>
>>   <name>mapred.output.compression.type</name>
>>   <value>BLOCK</value>
>> </property>
>>
>> Of course, I could rebuild my AMIs without this option, but is there
>> some way I can read my deflate files without going through that
>> hassle?  I'm hoping there's a command-line program to read these files
>> since none of my code is Java.
>>
>> Thanks in advance for any help. :)
>>
>> -- Jim R. Wilson (jimbojw)
>
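
Since the jobs in this thread are written in Python, a non-Java way to read the files might look like the sketch below. It assumes each part-NNNNN.deflate file is a single zlib (RFC 1950) stream, which is what Hadoop's default codec is generally understood to write for text output; that assumption has not been checked against the cluster described here. The function name inflate_stream and its chunk_size parameter are hypothetical, chosen only for illustration.

```python
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, decompressing a zlib-format stream on the way.

    src and dst are binary file-like objects. Reading in chunks keeps
    memory use bounded for large part files.
    Assumption: the input is one zlib (RFC 1950) stream, not raw
    deflate and not several streams concatenated together.
    """
    decompressor = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decompressor.decompress(chunk))
    # Emit anything still buffered inside the decompressor.
    dst.write(decompressor.flush())
```

One possible use is to copy a part file out of HDFS to local disk first, open it with open(path, "rb"), and pass it as src together with an output file opened with "wb". If the input turns out not to be in zlib format, zlib.error is raised, which is a hint that the assumption above does not hold for these files.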
