Date: Thu, 4 Jun 2015 18:06:38 +0000 (UTC)
From: "Tomas Hudik (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Subject: [jira] [Created] (HADOOP-12063) tar.gz compression doesn't produce correct output

Tomas Hudik created HADOOP-12063:
------------------------------------

             Summary: tar.gz compression doesn't produce correct output
                 Key: HADOOP-12063
                 URL: https://issues.apache.org/jira/browse/HADOOP-12063
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: Tomas Hudik

I'm not completely sure whether this is the right place to report this issue, since Pig is involved; however, Pig leaves decompression of tar.gz files to hadoop-common.

How to reproduce the issue:

# Put a simple file (file1) containing arbitrary text lines into in1 in HDFS.
# Compress the same file (file1) with tar -cvzf file1.tar.gz file and put the archive into in2 in HDFS.
# Run these simple commands in Pig:
{quote}
raw = load 'in1/' USING TextLoader AS (line: bytearray);
dump raw;
{quote}
Run them for both inputs (the uncompressed file and the compressed file).
# For the compressed version, the output starts with a strange first line:
{quote}
a0000644000570000001440000000002512534073736011260 0ustar loadhadoopusersa ...
{quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
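
A likely reading of the strange first line: it has the layout of a tar header (the file name a, the mode 0000644, the ustar magic, then what look like owner and group names), which is what remains when only the gzip layer of a .tar.gz is removed and the tar layer is left in place. The Python sketch below illustrates that effect outside Hadoop and Pig. It is not Hadoop's code path, and the helper names (make_tar_gz, gunzip_only, untar) and the sample payload are hypothetical.

```python
import gzip
import io
import tarfile


def make_tar_gz(name: str, payload: bytes) -> bytes:
    """Build, in memory, roughly what `tar -czf archive.tar.gz <name>` produces."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        info.mode = 0o644
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def gunzip_only(archive: bytes) -> bytes:
    """Remove only the gzip layer; the result is still a tar stream."""
    return gzip.decompress(archive)


def untar(archive: bytes, name: str) -> bytes:
    """Remove both layers and return the contents of one archive member."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return tar.extractfile(name).read()


# Hypothetical sample data: a member named "a" with two text lines.
payload = b"first line\nsecond line\n"
archive = make_tar_gz("a", payload)

# A line-oriented reader that only gunzips sees the 512-byte tar header
# glued to the front of the first text line.
raw = gunzip_only(archive)
first_line = raw.split(b"\n", 1)[0]

print(first_line.replace(b"\x00", b""))  # header fields run together, then "first line"
print(untar(archive, "a"))               # the original text, with the tar layer removed
```

If this reading is right, the output in the report is the expected result of treating a .tar.gz as a plain gzip stream, and compressing the file with gzip alone (no tar) should avoid the extra header line.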