Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Message-ID: <16204682.1168573827513.JavaMail.jira@brutus>
Date: Thu, 11 Jan 2007 19:50:27 -0800 (PST)
From: "Nigel Daley (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-886) thousands of TimerThreads created by metrics API
In-Reply-To: <12875481.1168573707546.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464092 ]

Nigel Daley commented on HADOOP-886:
------------------------------------

Digging around some more, I
found that this is the call stack that creates the threads.

	at org.apache.hadoop.metrics.spi.AbstractMetricsContext.startTimer(AbstractMetricsContext.java:239)
	at org.apache.hadoop.metrics.spi.AbstractMetricsContext.startMonitoring(AbstractMetricsContext.java:153)
	at org.apache.hadoop.metrics.file.FileContext.startMonitoring(FileContext.java:105)
	at org.apache.hadoop.metrics.Metrics.createRecord(Metrics.java:60)
	at org.apache.hadoop.mapred.ReduceTask$ReduceTaskMetrics.<init>(ReduceTask.java:53)
	at org.apache.hadoop.mapred.ReduceTask.<init>(ReduceTask.java:87)
	at org.apache.hadoop.mapred.TaskInProgress.getTaskToRun(TaskInProgress.java:527)
	at org.apache.hadoop.mapred.JobInProgress.obtainNewReduceTask(JobInProgress.java:359)
	at org.apache.hadoop.mapred.JobTracker.getNewTaskForTaskTracker(JobTracker.java:1200)
	at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:992)
	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:337)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:538)

> thousands of TimerThreads created by metrics API
> ------------------------------------------------
>
>                 Key: HADOOP-886
>                 URL: https://issues.apache.org/jira/browse/HADOOP-886
>             Project: Hadoop
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.10.1
>            Reporter: Nigel Daley
>
> When running the smallJobsBenchmark with 180 maps and hadoop metrics logging to a file
> (ie hadoop-metrics.properties file contains
> dfs.class=org.apache.hadoop.metrics.file.FileContext
> mapred.class=org.apache.hadoop.metrics.file.FileContext)
> then I get this error:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:574)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:517)
>         at org.apache.hadoop.ipc.Client.call(Client.java:452)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164)
>         at org.apache.hadoop.dfs.$Proxy0.isDir(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient.isDirectory(DFSClient.java:325)
>         at org.apache.hadoop.dfs.DistributedFileSystem.isDirectory(DistributedFileSystem.java:167)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:82)
>         at org.apache.hadoop.dfs.DistributedFileSystem.copyToLocalFile(DistributedFileSystem.java:222)
>         at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:842)
>         at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:86)
>         at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1338)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:337)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:538)
>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:258)
> Using jconsole, I see that 2000+ of these threads were created:
> Name: Timer-101
> State: TIMED_WAITING on java.util.TaskQueue@1501026
> Total blocked: 0  Total waited: 5
> Stack trace:
> java.lang.Object.wait(Native Method)
> java.util.TimerThread.mainLoop(Timer.java:509)
> java.util.TimerThread.run(Timer.java:462)
> The only use of the java.util.Timer API is in org.apache.hadoop.metrics.spi.AbstractMetricsContext.
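
The call stack above suggests that every ReduceTask construction reaches startMonitoring, and each call ends up in startTimer. Since every java.util.Timer owns one background thread, that would account for the 2000+ Timer-N threads. Below is a minimal sketch of one possible guard, assuming the context object is shared and only the timer creation needs protecting. GuardedContext and all of its members are hypothetical names for illustration; this is not the AbstractMetricsContext code and not necessarily how the issue should be fixed.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

public class TimerLeakSketch {

    /** Hypothetical stand-in for a metrics context; not the Hadoop class. */
    static class GuardedContext {
        private Timer timer; // null until monitoring has been started
        private final AtomicInteger timersCreated = new AtomicInteger();
        private final long periodMillis;

        GuardedContext(long periodMillis) {
            this.periodMillis = periodMillis;
        }

        /** Idempotent: only the first call creates a Timer and its thread. */
        synchronized void startMonitoring() {
            if (timer != null) {
                return; // already running, do not spawn another TimerThread
            }
            timer = new Timer("metrics-timer", true); // daemon thread
            timersCreated.incrementAndGet();
            timer.scheduleAtFixedRate(new TimerTask() {
                @Override
                public void run() {
                    // A real context would emit its buffered records here.
                }
            }, periodMillis, periodMillis);
        }

        /** Cancels the timer so its thread can exit. */
        synchronized void stopMonitoring() {
            if (timer != null) {
                timer.cancel();
                timer = null;
            }
        }

        int timersCreated() {
            return timersCreated.get();
        }
    }

    public static void main(String[] args) {
        GuardedContext ctx = new GuardedContext(5000L);
        // Simulate one startMonitoring call per task, as in the stack trace.
        for (int i = 0; i < 2000; i++) {
            ctx.startMonitoring();
        }
        System.out.println("timers created: " + ctx.timersCreated());
        ctx.stopMonitoring();
    }
}
```

With the guard in place, the 2000 simulated calls share a single timer thread. Without the null check, each call would construct its own Timer, which is the pattern jconsole appears to be showing.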