Date: Thu, 19 Sep 2013 01:04:52 +0000 (UTC)
From: "shanyu zhao (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception

     [ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shanyu zhao updated YARN-1219:
------------------------------

    Assignee: shanyu zhao

> FSDownload changes file suffix making FileUtil.unTar() throw exception
> ----------------------------------------------------------------------
>
>                 Key: YARN-1219
>                 URL: https://issues.apache.org/jira/browse/YARN-1219
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>
> While running a Hive join operation on YARN, I saw the exception described below. It is caused by FSDownload copying the files into a temp file and changing the suffix to ".tmp" before unpacking it. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix:
> {code}
> boolean gzipped = inFile.toString().endsWith("gz");
> {code}
> Because the temp file name ends in ".tmp" rather than "gz", a gzipped tarball is treated as a plain tar file and reading it fails.
> To fix this problem, we can remove the ".tmp" from the temp file name.
> Here is the detailed exception:
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240)
> 	at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676)
> 	at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625)
> 	at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:722)
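
The suffix problem described in the issue can be reproduced in isolation with a small sketch. This is not the FSDownload or FileUtil source: the class name SuffixCheckSketch and the helpers withTmpSuffix and withTmpPrefix are hypothetical, and only the endsWith("gz") check is taken from the issue text. The sketch assumes the temp name is formed by appending ".tmp" to the original file name, and shows one possible naming scheme that keeps the original suffix intact.

```java
import java.io.File;

public class SuffixCheckSketch {

    // Same test as the line quoted in the issue: the decision is made
    // purely from the file name, never from the file contents.
    static boolean looksGzipped(File inFile) {
        return inFile.toString().endsWith("gz");
    }

    // Hypothetical model of the problematic naming: ".tmp" is appended,
    // so the name no longer ends in "gz".
    static File withTmpSuffix(File original) {
        return new File(original.getParentFile(), original.getName() + ".tmp");
    }

    // Hypothetical alternative (an assumption, not the committed fix):
    // mark the temp file with a prefix so the original suffix survives.
    static File withTmpPrefix(File original) {
        return new File(original.getParentFile(), "tmp_" + original.getName());
    }

    public static void main(String[] args) {
        File archive = new File("archive.tar.gz");

        System.out.println(looksGzipped(archive));                // true
        System.out.println(looksGzipped(withTmpSuffix(archive))); // false
        System.out.println(looksGzipped(withTmpPrefix(archive))); // true
    }
}
```

Any temp naming that leaves the original suffix at the end of the file name keeps the name-based check working; dropping ".tmp" altogether, as the issue proposes, is the simplest such change.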