Subject: Re: HMaster restart with error
From: Ted Yu
To: user@hbase.apache.org
Date: Sun, 17 May 2015 20:45:17 -0700

After l-namenode1 became active master, it assigned regions:

2015-05-15 12:16:40,432 INFO [master:l-namenode1:60000]
master.RegionStates: Transitioned {6f806bb62b347c992cd243fc909276ff
state=OFFLINE, ts=1431663400432, server=null} to
{6f806bb62b347c992cd243fc909276ff state=OPEN, ts=1431663400432, server=
l-hbase31.data.cn8.qunar.com,60020,1431462584879}

However, l-hbase31 went down:

2015-05-15 12:16:40,508 INFO
[MASTER_SERVER_OPERATIONS-l-namenode1:60000-0]
handler.ServerShutdownHandler: Splitting logs for
l-hbase31.data.cn8.qunar.com,60020,1427789773001 before assignment.

l-namenode1 was restarted:

2015-05-15 12:20:25,322 INFO [main] util.VersionInfo: HBase 0.96.0-hadoop2
2015-05-15 12:20:25,323 INFO [main] util.VersionInfo: Subversion
https://svn.apache.org/repos/asf/hbase/branches/0.96 -r 1531434

However, it went down due to zookeeper session expiration:

2015-05-15 12:20:25,580 WARN [main] zookeeper.ZooKeeperNodeTracker: Can't
get or delete the master znode
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master

It started again after that and AssignmentManager did a lot of
assignments. Looks like the cluster was operational this time.

Cheers

On Sun, May 17, 2015 at 8:24 AM, Ted Yu wrote:

> bq. the backup master take over at 2015-05-15 12:16:40,024 ?
>
> The switch of active master should be earlier than 12:16:40,024 - shortly
> after 12:15:58
>
> l-namenode1 would do some initialization (such as waiting for region
> servers count to settle) after it became active master.
>
> I tried to download from http://pan.baidu.com/s/1eQlKXj0 (at home) but
> the download progress was very slow.
>
> Will try downloading later in the day.
>
> Do you have access to pastebin?
>
> Cheers
>
> On Sun, May 17, 2015 at 2:07 AM, Louis Hust wrote:
>
>> Hi Ted,
>>
>> Thanks for your reply!
>>
>> I found this log on l-namenode2.dba.cn8 during the restart:
>>
>> 2015-05-15 12:11:36,540 INFO [master:l-namenode2:60000]
>> master.ServerManager: Finished waiting for region servers count to settle;
>> checked in 5, slept for 4511 ms, expecting minimum of 1, maximum of
>> 2147483647, master is running.
>>
>> So does this mean the HMaster was ready to handle requests at 12:11:36?
>>
>> The backup master is l-namenode1.dba.cn8 and you can get its log at:
>>
>> http://pan.baidu.com/s/1eQlKXj0
>>
>> After I stopped l-namenode2.dba.cn8 at 12:15:58,
>> the backup master l-namenode1 took over, and I found this log:
>>
>> 2015-05-15 12:16:40,024 INFO [master:l-namenode1:60000]
>> master.ServerManager: Finished waiting for region servers count to settle;
>> checked in 4, slept for 5663 ms, expecting minimum of 1, maximum of
>> 2147483647, master is running.
>>
>> So the backup master took over at 2015-05-15 12:16:40,024?
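For both questions above, rather than inferring readiness from log
timestamps, you can poll the cluster from a client. A minimal sketch
against the 0.96 client API, using HBaseAdmin.checkHBaseAvailable (which
keeps throwing until an active master is answering RPCs); the class name
and the 60-attempt / 5-second polling budget are made up for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class WaitForActiveMaster {
      public static void main(String[] args) throws InterruptedException {
        // Reads hbase-site.xml from the classpath for the zookeeper quorum.
        Configuration conf = HBaseConfiguration.create();
        // Poll for up to ~5 minutes (60 attempts x 5 s); numbers are arbitrary.
        for (int attempt = 1; attempt <= 60; attempt++) {
          try {
            // Throws (e.g. MasterNotRunningException) until an active
            // master is up and answering RPCs.
            HBaseAdmin.checkHBaseAvailable(conf);
            System.out.println("Active master is serving requests");
            return;
          } catch (Exception e) {
            System.out.println("Attempt " + attempt + ": master not ready ("
                + e.getClass().getSimpleName() + ")");
            Thread.sleep(5000);
          }
        }
        System.err.println("Gave up waiting for an active master");
      }
    }

Note that "accepting master RPCs" is necessary but not sufficient: as the
logs above show, region assignment can still be in progress after the
master comes up.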
>>
>> But it seems l-namenode2 was not working normally, given this exception
>> in the log:
>>
>> 2015-05-15 12:16:40,522 INFO
>> [MASTER_SERVER_OPERATIONS-l-namenode1:60000-0]
>> handler.ServerShutdownHandler: Finished processing of shutdown of
>> l-hbase31.data.cn8.qunar.com,60020,1427789773001
>> 2015-05-15 12:17:11,301 WARN [686544788@qtp-660252776-212]
>> client.HConnectionManager$HConnectionImplementation: Checking master
>> connection
>> com.google.protobuf.ServiceException: java.net.ConnectException:
>> Connection refused
>>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1667)
>>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
>>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:40216)
>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceState.isMasterRunning(HConnectionManager.java:1484)
>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isKeepAliveMasterConnectedAndRunning(HConnectionManager.java:2110)
>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getKeepAliveMasterService(HConnectionManager.java:1836)
>>
>> Does the exception mean the HMaster was not working normally, or
>> something else?
>>
>> 2015-05-17 11:06 GMT+08:00 Ted Yu:
>>
>>> bq. the HMaster is handling two region server down, and not ready to
>>> handle client request?
>>>
>>> I didn't mean that - for a functioning master, handling region server
>>> shutdown is part of the master's job.
>>>
>>> You should see something similar to the following in a (functioning)
>>> master log:
>>>
>>> 2015-05-13 04:06:36,266 INFO [master:c6401:60000] master.ServerManager:
>>> Finished waiting for region servers count to settle; checked in 1, slept
>>> for 71582 ms, expecting minimum of 1, maximum of 2147483647, master is
>>> running.
>>>
>>> bq. wait the backend HMaster to take over
>>>
>>> Was there any exception in the backup master log after it took over?
>>>
>>> On Sat, May 16, 2015 at 6:44 PM, Louis Hust wrote:
>>>
>>>> Hi Ted,
>>>>
>>>> Thanks very much!
>>>>
>>>> The Namenode process was not running on l-namenode2.dba.cn8
>>>> (192.168.39.22); only the HMaster ran on it.
>>>> So you mean that at 2015-05-15 12:15:04 the HMaster was handling two
>>>> region servers going down, and was not ready to handle client requests?
>>>> And how can I tell from the logs when the HMaster is ready to handle
>>>> client requests?
>>>>
>>>> I stopped the HMaster at 12:15:58 because it could not handle requests;
>>>> I wanted to stop it and wait for the backup HMaster to take over.
>>>>
>>>> 2015-05-17 0:29 GMT+08:00 Ted Yu:
>>>>
>>>>> In the period you identified, the master was assigning regions, e.g.:
>>>>>
>>>>> 2015-05-15 12:13:09,683 INFO
>>>>> [l-namenode2.dba.cn8.qunar.com,60000,1431663090427-GeneralBulkAssigner-0]
>>>>> master.RegionStates: Transitioned {c634280ce287b2d6cebd88b61accf685
>>>>> state=OFFLINE, ts=1431663189621, server=null} to
>>>>> {c634280ce287b2d6cebd88b61accf685 state=PENDING_OPEN, ts=1431663189683,
>>>>> server=l-hbase26.data.cn8.qunar.com,60020,1431462615651}
>>>>> 2015-05-15 12:13:09,683 INFO
>>>>> [l-namenode2.dba.cn8.qunar.com,60000,1431663090427-GeneralBulkAssigner-2]
>>>>> master.RegionStates: Transitioned {2f60b1b4e51d32ef98ad19690f13a565
>>>>> state=OFFLINE, ts=1431663189621, server=null} to
>>>>> {2f60b1b4e51d32ef98ad19690f13a565 state=PENDING_OPEN, ts=1431663189683,
>>>>> server=l-hbase30.data.cn8.qunar.com,60020,1431462562233}
>>>>>
>>>>> Then two region servers went down:
>>>>>
>>>>> 2015-05-15 12:14:40,699 INFO [main-EventThread]
>>>>> zookeeper.RegionServerTracker: RegionServer ephemeral node deleted,
>>>>> processing expiration [l-hbase27.data.cn8.qunar.com,60020,1431663208899]
>>>>> 2015-05-15 12:15:04,899 INFO [main-EventThread]
>>>>> zookeeper.RegionServerTracker: RegionServer ephemeral node deleted,
>>>>> processing expiration [l-hbase25.data.cn8.qunar.com,60020,1431663193865]
>>>>>
>>>>> The master was stopped afterwards:
>>>>>
>>>>> Fri May 15 12:15:58 CST 2015 Terminating master
>>>>>
>>>>> The Namenode process was running on l-namenode2.dba.cn8, right?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Sat, May 16, 2015 at 7:50 AM, Louis Hust wrote:
>>>>>
>>>>>> Hi Ted,
>>>>>> Any idea?
>>>>>> When the HMaster restarts, how can I know when it can really handle
>>>>>> requests from the application? Is there any marker in the logs?
>>>>>>
>>>>>> 2015-05-16 14:05 GMT+08:00 Louis Hust:
>>>>>>
>>>>>>> @Ted,
>>>>>>> Please see the log from 12:11:29 to 12:15:28. In this time range the
>>>>>>> HMaster was in its restarting stage but could not handle client
>>>>>>> requests. Was the HMaster recovering, or doing something else?
>>>>>>>
>>>>>>> 2015-05-16 13:59 GMT+08:00 Louis Hust:
>>>>>>>
>>>>>>>> OK, you can get the log from
>>>>>>>> http://pan.baidu.com/s/1pqS6E
>>>>>>>>
>>>>>>>> 2015-05-16 13:26 GMT+08:00 Ted Yu:
>>>>>>>>
>>>>>>>>> Can you check the server log on 192.168.39.22?
>>>>>>>>>
>>>>>>>>> That should give you some clue.
>>>>>>>>>
>>>>>>>>> Cheers
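On the "any marker in the logs" question: the line quoted several times in
this thread,

  Finished waiting for region servers count to settle

is the usual sign that master initialization got past waiting for region
servers to check in, though the logs above show region assignment still
continues after it. A minimal sketch that pulls that marker out of a
master log file; the class name is made up and the log path argument is
whatever your deployment uses:

    import java.io.BufferedReader;
    import java.io.FileReader;

    public class FindSettleMarker {
      // Printed by master.ServerManager once enough region servers have
      // checked in; initialization proceeds past this point.
      private static final String MARKER =
          "Finished waiting for region servers count to settle";

      public static void main(String[] args) throws Exception {
        // args[0]: path to the HMaster log file (deployment-specific).
        BufferedReader reader = new BufferedReader(new FileReader(args[0]));
        try {
          String line;
          while ((line = reader.readLine()) != null) {
            if (line.contains(MARKER)) {
              // The leading timestamp shows when the master settled.
              System.out.println(line);
            }
          }
        } finally {
          reader.close();
        }
      }
    }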
>> > > > > >>> >> > > > > >>> Cheers >> > > > > >>> >> > > > > >>> On Fri, May 15, 2015 at 8:22 PM, Louis Hust < >> > louis.hust@gmail.com> >> > > > > >>> wrote: >> > > > > >>> >> > > > > >>> > Hi all, >> > > > > >>> > >> > > > > >>> > I use hbase0.96.0 with hadoop 2.2.0, >> > > > > >>> > and the custom said they can not write into hbase cluster, >> > > > > >>> > So i stop the HMaster and start it soon, >> > > > > >>> > >> > > > > >>> > But it seems that the HMaster not response to request, >> > following >> > > is >> > > > > the >> > > > > >>> > HMaster log: >> > > > > >>> > >> > > > > >>> > {logs} >> > > > > >>> > 2015-05-15 12:13:33,136 INFO [AM.ZK.Worker-pool2-t16] >> > > > > >>> master.RegionStates: >> > > > > >>> > Transitioned {9036a3befee90eeffb9082f90a4a9afa >> state=3DOPENING, >> > > > > >>> > ts=3D1431663212637, server=3Dl-hbase26.data.cn8.qunar.com >> > > > > >>> ,60020,1431462615651} >> > > > > >>> > to {9036a3befee90eeffb9082f90a4a9afa state=3DOPEN, >> > > ts=3D1431663213136, >> > > > > >>> server=3D >> > > > > >>> > l-hbase26.data.cn8.qunar.com,60020,1431462615651} >> > > > > >>> > 2015-05-15 12:13:33,139 INFO [AM.ZK.Worker-pool2-t4] >> > > > > >>> master.RegionStates: >> > > > > >>> > Onlined 9036a3befee90eeffb9082f90a4a9afa on >> > > > > >>> l-hbase26.data.cn8.qunar.com >> > > > > >>> > ,60020,1431462615651 >> > > > > >>> > 2015-05-15 12:14:40,699 INFO [main-EventThread] >> > > > > >>> > zookeeper.RegionServerTracker: RegionServer ephemeral node >> > > deleted, >> > > > > >>> > processing expiration [l-hbase27.data.cn8.qunar.com >> > > > > >>> ,60020,1431663208899] >> > > > > >>> > 2015-05-15 12:15:04,899 INFO [main-EventThread] >> > > > > >>> > zookeeper.RegionServerTracker: RegionServer ephemeral node >> > > deleted, >> > > > > >>> > processing expiration [l-hbase25.data.cn8.qunar.com >> > > > > >>> ,60020,1431663193865] >> > > > > >>> > 2015-05-15 12:15:24,465 WARN [249240421@qtp-591022857-33] >> > > > > >>> > client.HConnectionManager$HConnectionImplementation: >> Checking >> > > > master >> > > > > >>> > connection >> > > > > >>> > com.google.protobuf.ServiceException: >> > > > > java.net.SocketTimeoutException: >> > > > > >>> Call >> > > > > >>> > to l-namenode2.dba.cn8.qunar.com/192.168.39.22:60000 faile= d >> > > > because >> > > > > >>> > java.net.SocketTimeoutException: 60000 millis timeout whil= e >> > > waiting >> > > > > for >> > > > > >>> > channel to be ready for read. 
>>>>>>>>>> ch : java.nio.channels.SocketChannel[connected
>>>>>>>>>> local=/192.168.39.22:47700
>>>>>>>>>> remote=l-namenode2.dba.cn8.qunar.com/192.168.39.22:60000]
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1667)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
>>>>>>>>>>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:40216)
>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceState.isMasterRunning(HConnectionManager.java:1484)
>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isKeepAliveMasterConnectedAndRunning(HConnectionManager.java:2110)
>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getKeepAliveMasterService(HConnectionManager.java:1836)
>>>>>>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.listTables(HConnectionManager.java:2531)
>>>>>>>>>>   at org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:298)
>>>>>>>>>>   at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.__jamon_innerUnit__userTables(MasterStatusTmplImpl.java:530)
>>>>>>>>>>   at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.renderNoFlush(MasterStatusTmplImpl.java:255)
>>>>>>>>>>   at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.renderNoFlush(MasterStatusTmpl.java:382)
>>>>>>>>>>   at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.render(MasterStatusTmpl.java:372)
>>>>>>>>>>   at org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:95)
>>>>>>>>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
>>>>>>>>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>>>>>>>>>   at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>>>>>>>>   at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>>>>>>>>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>>>>>>>>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>>>>>>>>>   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>>>>>>>>>   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>>>>>>>>>   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>>>>>>>>>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>>>>>>>>>   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>>>>>>>>>   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>>>>>>>>>   at org.mortbay.jetty.Server.handle(Server.java:326)
>>>>>>>>>>   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>>>>>>>>>   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>>>>>>>>>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>>>>>>>>>>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>>>>>>>>>>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>>>>>>>>>   at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>>>>>>>>>>   at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>>>>>>>>>> Caused by: java.net.SocketTimeoutException: Call to
>>>>>>>>>> l-namenode2.dba.cn8.qunar.com/192.168.39.22:60000 failed because
>>>>>>>>>> java.net.SocketTimeoutException: 60000 millis timeout while waiting for
>>>>>>>>>> channel to be ready for read.
>>>>>>>>>> ch : java.nio.channels.SocketChannel[connected
>>>>>>>>>> local=/192.168.39.22:47700
>>>>>>>>>> remote=l-namenode2.dba.cn8.qunar.com/192.168.39.22:60000]
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1475)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1450)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1650)
>>>>>>>>>>   ... 37 more
>>>>>>>>>> Caused by: java.net.SocketTimeoutException: 60000 millis timeout
>>>>>>>>>> while waiting for channel to be ready for read.
>>>>>>>>>> ch : java.nio.channels.SocketChannel[connected
>>>>>>>>>> local=/192.168.39.22:47700
>>>>>>>>>> remote=l-namenode2.dba.cn8.qunar.com/192.168.39.22:60000]
>>>>>>>>>>   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>>>>>>>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>>>>>>>>>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>>>>>>>>>>   at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>>>>>>>>>   at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection$PingInputStream.read(RpcClient.java:553)
>>>>>>>>>>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>>>>>>>>>>   at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>>>>>>>>>>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1057)
>>>>>>>>>>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:719)
>>>>>>>>>> Fri May 15 12:15:58 CST 2015 Terminating master
>>>>>>>>>> {/logs}
>>>>>>>>>>
>>>>>>>>>> So what does the exception mean, why does it happen, and how can
>>>>>>>>>> the problem be solved?
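As for working around the SocketTimeoutException while a master restart or
failover is in flight: the trace shows the default 60000 ms RPC timeout
expiring, so one client-side mitigation is a larger timeout and retry
budget. This only papers over a transient outage - the master still has to
come back. A minimal sketch against the 0.96 client API; the configuration
keys are standard HBase client settings, but the values and the class name
are illustrative only:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class PatientListTables {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Widen the per-RPC timeout and retry budget beyond the 60000 ms
        // default seen in the trace above. Illustrative values.
        conf.setInt("hbase.rpc.timeout", 120000);        // per-RPC timeout, ms
        conf.setInt("hbase.client.retries.number", 10);  // client retry attempts
        conf.setInt("hbase.client.pause", 2000);         // base pause between retries, ms
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
          // The same call the master status page was making when it timed out.
          for (HTableDescriptor table : admin.listTables()) {
            System.out.println(table.getNameAsString());
          }
        } finally {
          admin.close();
        }
      }
    }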