hbase-user mailing list archives

From Suraj Varma <svarma...@gmail.com>
Subject Re: "Child" processes not getting killed
Date Thu, 09 Dec 2010 22:44:26 GMT
Take a thread dump of those child processes before killing them. For
instance, use jstack to take one thread dump, wait 30 seconds, and take
another. Comparing the two should tell you what they are waiting on.
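
If attaching jstack to the children is inconvenient, the same two-snapshot idea
can be approximated from inside the task JVM. What follows is only a minimal
sketch, not what jstack itself does: the class name ThreadSnapshot and its
method are hypothetical, and it relies only on the standard
Thread.getAllStackTraces() call. It lists non-daemon threads, since those are
the ones that can keep a JVM from exiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or HBase.
public class ThreadSnapshot {

    /** Returns one formatted entry (name, state, stack) per live non-daemon thread. */
    public static List<String> describeNonDaemonThreads() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.isDaemon()) {
                continue; // daemon threads do not keep the JVM alive
            }
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState());
            for (StackTraceElement frame : e.getValue()) {
                sb.append("\n    at ").append(frame);
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** Prints two snapshots separated by a pause (default 30 seconds). */
    public static void main(String[] args) throws InterruptedException {
        long pauseMillis = args.length > 0 ? Long.parseLong(args[0]) : 30_000L;
        for (int i = 1; i <= 2; i++) {
            System.out.println("=== snapshot " + i + " ===");
            for (String entry : describeNonDaemonThreads()) {
                System.out.println(entry);
            }
            if (i == 1) {
                Thread.sleep(pauseMillis);
            }
        }
    }
}
```

A thread that shows up in both snapshots with the same stack is a likely
candidate for whatever is holding the process open.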
--Suraj

On Thu, Dec 9, 2010 at 1:53 AM, Hari Sreekumar <hsreekumar@clickable.com> wrote:

> Tried that, but it didn't help much. I was opening a table and scanning it in
> the reducer. Now I am calling scanner.close() in each reducer, and I have put
> HTable.close() in the cleanup() function too. I am still seeing those children
> even after the job is killed :(
>
> I am using TableMapReduceUtil.initTableMapperJob(). Do I have to call close
> for that separately in the mapper too?
>
> hari
>
> On Wed, Dec 8, 2010 at 11:41 PM, Hari Sreekumar <hsreekumar@clickable.com>
> wrote:
>
> > That's exactly what I was thinking just now, and there's a high
> > probability of that. I'll check tomorrow and update here. Also, we have
> > the JVM reuse parameter set to -1. Not sure if that would matter, though.
> >
> > Thanks,
> > hari
> >
> > On Wednesday, December 8, 2010, Veeramachaneni, Ravi
> > <ravi.veeramachaneni@navteq.com> wrote:
> > > If you have not already, check your code. It is possible that you
> > > missed a close() call on a Scanner, which may leave the connection
> > > hanging, and with it the process(es).
> > >
> > > Ravi
> > > ________________________________________
> > > From: saint.ack@gmail.com [saint.ack@gmail.com] On Behalf Of Stack [stack@duboce.net]
> > > Sent: Wednesday, December 08, 2010 11:08 AM
> > > To: user@hbase.apache.org
> > > Subject: Re: "Child" processes not getting killed
> > >
> > > Add some logging to your Map task and retry?
> > > St.Ack
> > >
> > > On Tue, Dec 7, 2010 at 10:28 PM, Hari Sreekumar
> > > <hsreekumar@clickable.com> wrote:
> > >> Hi Stack,
> > >>
> > >> The logs don't show anything nasty. E.g., I ran a job which spawned
> > >> 5 mappers. All of the Child processes spawned by them remained even after
> > >> the job completed. 3 map tasks got completed, and they have the following
> > >> log:
> > >>
> > >> *stdout logs*
> > >>
> > >> ------------------------------
> > >>
> > >>
> > >> *stderr logs*
> > >>
> > >> ------------------------------
> > >>
> > >>
> > >> *syslog logs*
> > >>
> > >> 2010-12-08 11:43:28,358 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > >> Initializing JVM Metrics with processName=MAP, sessionId=
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:zookeeper.version=3.2.2-888565, built on 12/08/2009 21:51
> > >> GMT
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:host.name=hadoop1
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.version=1.6.0_22
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.vendor=Sun Microsystems Inc.
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.home=/usr/java/jdk1.6.0_22/jre
> > >> 2010-12-08 11:43:28,687 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/hbase-0.20.6.jar:/hbase-0.20.6-test.jar:/conf:/lib/zookeeper-3.2.2.jar::/home/hadoop/DFS/MultiNode/mapred/local/taskTracker/jobcache/job_201012071556_0020/jars/classes:/home/hadoop/DFS/MultiNode/mapred/local/taskTracker/jobcache/job_201012071556_0020/jars:/home/hadoop/DFS/MultiNode/mapred/local/taskTracker/jobcache/job_201012071556_0020/attempt_201012071556_0020_m_000000_0/work
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64:/home/hadoop/DFS/MultiNode/mapred/local/taskTracker/jobcache/job_201012071556_0020/attempt_201012071556_0020_m_000000_0/work
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.io.tmpdir=/home/hadoop/DFS/MultiNode/mapred/local/taskTracker/jobcache/job_201012071556_0020/attempt_201012071556_0020_m_000000_0/work/tmp
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:java.compiler=<NA>
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:os.name=Linux
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:os.arch=amd64
> > >> 2010-12-08 11:43:28,688 INFO org.apache.zookeeper.ZooKeeper: Client
> > >> environment:os.version=2.6.18-
> > >
> >
>
