hbase-user mailing list archives

From Bi,hongyu—mike <boyl...@gmail.com>
Subject Re: client call scan on some region hang
Date Wed, 07 Jan 2015 02:37:14 GMT
Thanks Ted,
I finally resolved the issue. The root cause: region_mover calls
isSuccessfulScan, which scans from the start key of the moved region. That
region was filled with lots of expired cells, so the scan appeared to hang.
I think isSuccessfulScan is only meant to test whether the moved region is
readable. Why not use a get instead, which might avoid this case?
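
To illustrate the reasoning above, here is a toy model in Python. It is only a sketch and does not reflect HBase internals: the TTL, the clock value, the row names, and the helpers scan_first_live and get_row are all made up for illustration. It counts how many cells a forward scan has to skip before it finds a live row, compared with a point lookup that is bounded to one row.

```python
import bisect

TTL = 100    # hypothetical time-to-live, in arbitrary time units
NOW = 1_000  # hypothetical current time


def is_live(ts, now=NOW, ttl=TTL):
    """A cell is live while its age does not exceed the TTL."""
    return now - ts <= ttl


def scan_first_live(rows, keys, start_key):
    """Scan forward from start_key until a row with a live cell turns up.

    Returns (row key or None, number of cells examined).
    """
    examined = 0
    start = bisect.bisect_left(keys, start_key)
    for key in keys[start:]:
        for ts in rows[key]:
            examined += 1
            if is_live(ts):
                return key, examined
    return None, examined


def get_row(rows, key):
    """Point lookup: examine only the cells stored under one row key.

    Returns (list of live timestamps, number of cells examined).
    """
    cells = rows.get(key, [])
    return [ts for ts in cells if is_live(ts)], len(cells)


# 100,000 rows holding one expired cell each, followed by a single live row.
rows = {f"row{n:06d}": [0] for n in range(100_000)}
rows["row100000"] = [NOW]
keys = sorted(rows)

scan_key, scan_cost = scan_first_live(rows, keys, "row000000")
get_cells, get_cost = get_row(rows, "row000000")
print("scan:", scan_key, "cells examined:", scan_cost)
print("get: ", get_cells, "cells examined:", get_cost)
```

In this model the scan walks past every expired cell before it can return anything, while the get touches one row and comes back empty. An empty result still shows that the region answered, which is all a readability check needs.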



2015-01-06 20:59 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:

> Can you pastebin the region server log?
>
> While the scan is being performed, can you get a jstack and pastebin it?
>
> 0.94.15 is an old release; any chance of an upgrade?
>
> Thanks
>
>
>
> > On Jan 6, 2015, at 2:34 AM, Bi,hongyu—mike <boylook@gmail.com> wrote:
> >
> > Sorry, I forgot to mention the version: 0.94.15.
> >
> > Also, calling compact (and flushing the region many times) from the hbase
> > shell had no effect; no compaction happened.
> >
> > 2015-01-06 18:26 GMT+08:00 Bi,hongyu—mike <boylook@gmail.com>:
> >
> >> scan debug log:
> >> 15/01/06 18:20:56 DEBUG client.ClientScanner: Creating scanner over T
> >> starting at key 'Rowx'
> >> 15/01/06 18:20:56 DEBUG client.ClientScanner: Advancing internal scanner
> >> to startKey at 'Rowx'
> >> 15/01/06 18:20:56 DEBUG client.MetaScanner: Scanning .META. starting at
> >> row=XXXX for max=10 rows using
> >> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@427b7b5d
> >> 15/01/06 18:20:56 DEBUG
> >> client.HConnectionManager$HConnectionImplementation: Cached location for
> >> <THAT REGION> is RS_IP:60020
> >> ......
> >> 15/01/06 18:21:16 DEBUG zookeeper.ClientCnxn: Got ping response for
> >> sessionid: 0x3499df682b076cf after 0ms
> >> 15/01/06 18:21:36 DEBUG zookeeper.ClientCnxn: Got ping response for
> >> sessionid: 0x3499df682b076cf after 0ms
> >> 15/01/06 18:21:56 DEBUG zookeeper.ClientCnxn: Got ping response for
> >> sessionid: 0x3499df682b076cf after 0ms
> >> 15/01/06 18:21:56 DEBUG zookeeper.ClientCnxn: Reading reply
> >> sessionid:0x3499df682b076cf, packet:: clientPath:null serverPath:null
> >> finished:false header:: 9,4  replyHeader:: 9,21519728740,-101  request::
> >> '/hbase/table/T,F  response::
> >> 15/01/06 18:21:56 DEBUG
> >> client.HConnectionManager$HConnectionImplementation: Removed <THAT REGION>
> >> for tableName=T from cache because of Rowx
> >> 15/01/06 18:21:56 DEBUG
> >> client.HConnectionManager$HConnectionImplementation: Cached location for
> >> <THAT REGION> is RS_IP:60020
> >> 15/01/06 18:21:56 DEBUG client.ClientScanner: Advancing internal scanner
> >> to startKey at 'Rowx'
> >>
> >> 2015-01-06 18:09 GMT+08:00 Bi,hongyu—mike <boylook@gmail.com>:
> >>
> >>> write traffic is ok:
> >>> 2015-01-06 17:46:01,127 WARN org.apache.hadoop.hbase.ipc.SecureServer:
> >>> (responseTooSlow): {"processingtimems":68,"call":"multi(Region=Rx of 149
> >>> actions and first row key= Rowx), rpc version=1, client version=29,
> >>> methodsFingerPrint=-1105746420","client":"IP:port}
> >>>
> >>> A scan on that region is slow:
> >>> 2015-01-06 16:23:25,087 ERROR
> >>> org.apache.hadoop.hbase.regionserver.HRegionServer:
> >>> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting on
> >>> region Rx, call next(8002464006782223710, 1, 0), rpc version=1, client
> >>> version=29, methodsFingerPrint=-1771721648 from 10.201.202.31:31285
> >>> after 87821 ms, since caller disconnected
> >>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:438);
> >>>
> >>> Running hbase hfile -r 'Rx' -p does produce the result.
> >>>
> >>> 2015-01-06 18:03 GMT+08:00 Bi,hongyu—mike <boylook@gmail.com>:
> >>>
> >>>> Hi all,
> >>>>
> >>>> There's one region that accepts write requests but cannot be scanned.
> >>>> If I scan that region I get a scanner lease timeout (60s by
> >>>> default), while I can scan other regions of the same table and get the
> >>>> result in less than 10ms (our slow RPC threshold is 10ms).
> >>>>
> >>>> hbck reports OK, and I used the "hbase hfile" tool to check that
> >>>> region's storefiles and the region itself; all of them returned results.
> >>>>
> >>>> So I have no idea what is going on...
> >>>> Any help will be appreciated, many thanks!
> >>
>
