hbase-issues mailing list archives

From "Stephen Yuan Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14533) Thrift client gets "AsyncProcess: Failed to get region location .... closed"
Date Wed, 07 Oct 2015 23:18:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947790#comment-14947790 ]

Stephen Yuan Jiang commented on HBASE-14533:
--------------------------------------------

[~stack], is this the same as HBASE-14196?

> Thrift client gets "AsyncProcess: Failed to get region location .... closed"
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-14533
>                 URL: https://issues.apache.org/jira/browse/HBASE-14533
>             Project: HBase
>          Issue Type: Bug
>          Components: REST, Thrift
>    Affects Versions: 1.0.0
>            Reporter: stack
>         Attachments: test.patch
>
>
> An internal Python client has been getting the stack trace below since HBASE-13437:
> {code}
> 2015-09-30 11:27:31,670 runner                    ERROR   : scheduler executor error
> 2015-09-30 11:27:31,674 runner                    ERROR   : Traceback (most recent call last):
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/runner.py", line 82, in run
>     fetch_list = self.__scheduler_executor.run()
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/scheduler.py", line 35, in run
>     with self.__fetch_db_dao.get_scanner() as scanner:
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_dao.py", line 57, in get_scanner
>     caching=caching, field_filter_list=field_filter_list)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_client_template.py", line 104, in get_entity_scanner
>     self.__fix_cfs(self.__filter_columns(field_filter_list)), caching)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_entity_scanner.py", line 81, in open
>     self.__scanner_id = client.scannerOpenWithScan(table_name, scan)
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1494, in scannerOpenWithScan
>     return self.recv_scannerOpenWithScan()
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1518, in recv_scannerOpenWithScan
>     raise result.io
> IOError: IOError(message="org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location\n\tat org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)\n\tat org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)\n\tat org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)\n\tat org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)\n\tat org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)\n\tat org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)\n\tat org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)\n\tat org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)\n\tat org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)\n\tat org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1446)\n\tat sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat com.sun.proxy.$Proxy14.scannerOpenWithScan(Unknown Source)\n\tat org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4609)\n\tat org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4593)\n\tat org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)\n\tat org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException: hconnection-0xa8e1bf9 closed\n\tat org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)\n\tat org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)\n\t... 23 more\n")
> {code}
> On the Thrift server side we see this:
> {code}
> 2015-09-30 07:22:59,427 ERROR org.apache.hadoop.hbase.client.AsyncProcess: Failed to get region location
> java.io.IOException: hconnection-0x4142991e closed
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1256)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1209)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
>         at com.sun.proxy.$Proxy14.mutateRow(Unknown Source)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4334)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4318)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)
>         at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> HBASE-13437 has us actually execute a close on timeout -- before, we'd mark the connection closed but would never call close on it.
> A background chore goes around stamping Connections in the ConnectionCache as 'closed' if they have not been used in ten minutes. The 'close' can come in at any time, in particular between the point at which we get the table/connection and the point at which we go to use it, i.e. when we flush puts. It is at the flush-puts point that we get the above 'AsyncProcess: Failed to get region location' (it is not a failure to find the region location but rather our noticing that the connection has been closed).
> Attempts at reproducing this issue locally by letting the Connection time out can generate the above exception if a certain dance is done, but it is hard to do; I am not reproducing the actual usage by the aforementioned client.
> Next steps would be setting up a Python client talking via Thrift and then trying to use a connection after it has been evicted from the connection cache. Another thing to try is a pool of connections on the Python side... connections are identified by user and table.
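
The race described in the issue (a chore closes an idle connection between the moment a caller obtains it and the moment the caller uses it) can be illustrated with a toy model. This is only a sketch under stated assumptions: FakeConnection, FakeConnectionCache, ManualClock and demo are hypothetical names invented for this illustration, not HBase or Thrift classes, and the 600-second limit simply mirrors the "ten minutes" mentioned above.

```python
import time


class ConnectionClosedError(IOError):
    """Stand-in for the 'hconnection-... closed' IOException in the traces."""


class FakeConnection:
    """Hypothetical connection that refuses work once it has been closed."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

    def locate_region(self, row):
        # Mirrors the symptom: the lookup fails because the connection is
        # closed, not because the region cannot be found.
        if self.closed:
            raise ConnectionClosedError("%s closed" % self.name)
        return "region-for-%s" % row


class FakeConnectionCache:
    """Hypothetical cache whose chore closes connections idle for too long."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self.clock = clock
        self._entries = {}  # key -> (connection, last access time)
        self._next_id = 1

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0].closed:
            conn = FakeConnection("hconnection-%d" % self._next_id)
            self._next_id += 1
        else:
            conn = entry[0]
        self._entries[key] = (conn, self.clock())
        return conn

    def run_chore(self):
        # Close and evict every connection that has been idle too long,
        # regardless of whether some caller still holds a reference to it.
        now = self.clock()
        for key, (conn, last_access) in list(self._entries.items()):
            if now - last_access > self.max_idle_seconds:
                conn.close()
                del self._entries[key]


class ManualClock:
    """Deterministic clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def demo():
    clock = ManualClock()
    cache = FakeConnectionCache(max_idle_seconds=600, clock=clock)
    conn = cache.get(("alice", "t1"))   # caller obtains the connection
    clock.advance(601)                  # ...then sits idle past the limit
    cache.run_chore()                   # chore closes it underneath the caller
    try:
        conn.locate_region("row1")      # caller finally uses the stale handle
    except ConnectionClosedError as err:
        return str(err)
    return None
```

In this model, fetching the connection from the cache again after the chore has run hands back a fresh, usable one, which is why holding a reference across the idle window is what triggers the error. Whether the real fix belongs in the cache, in the chore, or in the caller is not something this sketch tries to answer.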



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
