Date: Fri, 13 Apr 2012 06:36:42 +0000 (UTC)
From: "tim.wu (Updated) (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Message-ID: <1138310284.20871.1334299002528.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <735814315.20820.1334297954050.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HADOOP-8274) In pseudo or cluster model under Cygwin, tasktracker can not create a new job because of symlink problem.

     [ https://issues.apache.org/jira/browse/HADOOP-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tim.wu updated HADOOP-8274:
---------------------------

    Description:
Standalone mode works. But in pseudo-distributed or cluster mode it always throws errors, even when I just run the wordcount example.
HDFS works fine, but the tasktracker cannot create threads (JVMs) for a new job, and /logs/userlogs/job-xxxx/attempt-xxxx/ is empty.

The reason appears to be that on Windows, Java does not recognize a symlink to a folder as a folder.

The detailed description follows.

======================================================================================================

First, the error log of the tasktracker looks like this:

======================
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM : jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_000002_0 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

=======================================

I remote-debugged the tasktracker. In
org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID, boolean, String[]), line 97:

  public static void createTaskAttemptLogDir(TaskAttemptID taskID,
      boolean isCleanup, String[] localDirs) throws IOException{
    String cleanupSuffix = isCleanup ? ".cleanup" : "";
    String strAttemptLogDir = getTaskAttemptLogDir(taskID,
        cleanupSuffix, localDirs);
    File attemptLogDir = new File(strAttemptLogDir);
    if (!attemptLogDir.mkdirs()) {
      throw new IOException("Creation of " + attemptLogDir + " failed.");
    }
    String strLinkAttemptLogDir =
        getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar +
        taskID.toString() + cleanupSuffix;
    if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
      throw new IOException("Creation of symlink from " +
                            strLinkAttemptLogDir + " to " + strAttemptLogDir +
                            " failed.");
    }
    //Set permissions for target attempt log dir
    FsPermission userOnly = new FsPermission((short) 0777); //FsPermission userOnly = new FsPermission((short) 0700);
    FileUtil.setPermission(attemptLogDir, userOnly);
  }

and the symLink() function:

  public static int symLink(String target, String linkname) throws IOException{
    String cmd = "ln -s " + target + " " + linkname;
    Process p = Runtime.getRuntime().exec(cmd, null);
    int returnVal = -1;
    try{
      returnVal = p.waitFor();
    } catch(InterruptedException e){
      //do nothing as of yet
    }
    if (returnVal != 0) {
      LOG.warn("Command '" + cmd + "' failed " + returnVal +
               " with: " + copyStderr(p));
    }
    return returnVal;
  }

So Hadoop creates a log folder under ${hadoop.tmp.dir} and then invokes "ln -s" to create a symlink to it under /logs/userlogs/job-xxxx/attempt-xxxx.

In my case,
strLinkAttemptLogDir = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
strAttemptLogDir=/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

After the tasktracker creates the child process, the child fails in the following function:
in package org.apache.hadoop.mapred, DefaultTaskController.launchTask(String, String, String, List, List, File, String, String), line 107:

...............
      //mkdir the loglocation
      String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
      if (!localFs.mkdirs(new Path(logLocation))) {
        throw new IOException("Mkdirs failed to create "
                   + logLocation);
      }
..............

mkdirs() returns false because logLocation is a symlink file. In my case, logLocation=D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1. If I open it from Explorer in Windows, it is just a file, not a folder or a shortcut. Its content is:

/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

This is because mkdirs() is:

  public boolean mkdirs(Path f) throws IOException {
    Path parent = f.getParent();
    File p2f = pathToFile(f);
    return (parent == null || mkdirs(parent)) &&
      (p2f.mkdir() || p2f.isDirectory());
  }

Here p2f.mkdir() fails because something already exists at that path, p2f.isDirectory() returns false, and p2f.isFile() returns true. So, for Java, it is a file. Hence IOException("Mkdirs failed to create D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1") is thrown in the child, which exits with -1. Then we get the above exception in the main thread.

Is there any way to turn off this symlink? Or is there another approach I can follow?

BTW, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name}, and my ${user.name} is timwu. So it should create a tmp folder /tmp/hadoop-timwu under Cygwin's root. However, it actually creates the folder d:/tmp/hadoop-timwu, which in Cygwin is /cygdrive/d/tmp/hadoop-timwu. Is that correct?

        Summary: In pseudo or cluster model under Cygwin, tasktracker can not create a new job because of symlink problem.  (was: In pseudo or cluster model Under cygwin, tasktracker can not create a new job because of symlink problem.)

> In pseudo or cluster model under Cygwin, tasktracker can not create a new job because of symlink problem.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8274
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.205.0, 1.0.0, 1.0.1, 0.22.0
>         Environment: windows7+cygwin 1.7.11-1+jdk1.6.0_31+hadoop 1.0.0
>            Reporter: tim.wu
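
The failure of the `p2f.mkdir() || p2f.isDirectory()` check quoted in the description can be reproduced without Cygwin or Hadoop. The following is a minimal sketch, not Hadoop code. It assumes Java 7 or later for `java.nio.file.Files`, whereas the report's environment is JDK 1.6. The class name `SymlinkAsFileDemo`, the helper `createFakeLink`, and the path `/tmp/example/target` are hypothetical names introduced for illustration. A plain file holding a target path stands in for the Cygwin symlink as the reporter describes Windows showing it.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class SymlinkAsFileDemo {

    // The non-recursive part of the mkdirs() check quoted in the description.
    static boolean mkdirOrIsDirectory(File f) {
        return f.mkdir() || f.isDirectory();
    }

    // Hypothetical helper: writes a plain file whose content is a target path,
    // standing in for what the reporter found where the symlink should be.
    static File createFakeLink(File parent, String name, String target)
            throws IOException {
        File link = new File(parent, name);
        Files.write(link.toPath(), target.getBytes(StandardCharsets.UTF_8));
        return link;
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("symlink-demo").toFile();
        File fakeLink = createFakeLink(base, "attempt_dir", "/tmp/example/target");
        File fresh = new File(base, "fresh_dir");

        // A regular file already occupies the path, so mkdir() cannot create a
        // directory there and isDirectory() is false: the whole check fails.
        System.out.println("fake link isFile:      " + fakeLink.isFile());
        System.out.println("fake link isDirectory: " + fakeLink.isDirectory());
        System.out.println("check on fake link:    " + mkdirOrIsDirectory(fakeLink));

        // On a path where nothing exists yet, mkdir() succeeds.
        System.out.println("check on fresh path:   " + mkdirOrIsDirectory(fresh));
    }
}
```

If this matches what happens on the reporter's machine, any fix would need either to have the JVM see a real directory at the link path or to skip the symlink step on Windows. Which of those Hadoop should do is not settled by this sketch.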