Date: Wed, 4 Jun 2008 18:12:55 -0500
From: "Jim R. Wilson"
To: core-user@hadoop.apache.org
Subject: [core-user] Help deflating output files

Hi all,

I'm using hadoop-streaming to execute Python jobs in an EC2 cluster. The output directory in HDFS contains part-00000.deflate files. How can I decompress them back into regular text?

In my hadoop-site.xml, I unfortunately have:

  <property>
    <name>mapred.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.output.compression.type</name>
    <value>BLOCK</value>
  </property>

Of course, I could rebuild my AMIs without this option, but is there some way I can read my deflate files without going through that hassle? I'm hoping there's a command-line program to read these files, since none of my code is Java.

Thanks in advance for any help. :)

-- Jim R. Wilson (jimbojw)
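For illustration only, here is a minimal Python sketch of one way such files might be read without any Java. It assumes the .deflate files are plain zlib-format streams (which is what Hadoop's default codec is generally understood to write for text output) and that each file holds a single stream; neither assumption is confirmed in this thread. The script name inflate.py and the function inflate_stream are hypothetical.

```python
import sys
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy binary file object src to dst, inflating zlib-format data.

    Assumption: src holds exactly one zlib stream. The data is processed
    in chunks so large part files never have to fit in memory at once.
    """
    decomp = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decomp.decompress(chunk))
    # Emit anything the decompressor is still buffering.
    dst.write(decomp.flush())


if __name__ == "__main__":
    # Each command-line argument is a local .deflate file (hypothetical usage).
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            inflate_stream(f, sys.stdout.buffer)
```

A possible workflow would be to copy a part file out of HDFS first (for example with `hadoop fs -get`) and then run `python inflate.py part-00000.deflate > part-00000.txt`. If zlib reports a header error, the zlib-format assumption does not hold for these files and a different approach is needed.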