hbase-issues mailing list archives

From "Guanghao Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18003) Fix flaky test TestAsyncTableAdminApi and TestAsyncRegionAdminApi
Date Thu, 25 May 2017 10:47:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024522#comment-16024522
] 

Guanghao Zhang commented on HBASE-18003:
----------------------------------------

{code}
2017-05-25 17:56:34,967 INFO  [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=50801] master.HMaster$11(2297):
Client=hao//127.0.0.1 disable testModifyColumnFamily
2017-05-25 17:56:37,974 INFO  [RpcClient-timer-pool1-t1] client.AsyncHBaseAdmin$TableProcedureBiConsumer(2219):
Operation: DISABLE, Table Name: default:testModifyColumnFamily failed with Failed after attempts=3,
exceptions: 
Thu May 25 17:56:35 CST 2017, , java.io.IOException: Call to localhost/127.0.0.1:50801 failed
on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=294, waitTime=1008,
rpcTimeout=1000
Thu May 25 17:56:37 CST 2017, , java.io.IOException: Call to localhost/127.0.0.1:50801 failed
on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=295, waitTime=1299,
rpcTimeout=1000
Thu May 25 17:56:37 CST 2017, , java.io.IOException: Call to localhost/127.0.0.1:50801 failed
on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=296, waitTime=668,
rpcTimeout=660
2017-05-25 17:56:38,936 DEBUG [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=50801] procedure2.ProcedureExecutor(788):
Stored procId=15, owner=hao, state=RUNNABLE:DISABLE_TABLE_PREPARE, DisableTableProcedure table=testModifyColumnFamily
{code}

For this disable table procedure, the master returns the procedure id only after it has submitted
the procedure to the ProcedureExecutor. The procedure above took about 4 seconds to submit (the
request arrived at 17:56:34,967 and the procedure was stored at 17:56:38,936). So the disable
table call failed because the rpc timeout is 1 second and the retry number is 3.
For admin operations, I think we don't need to change the default timeout config. The retries
are not needed, either. (Or we can set a retry number > 1 to test the nonce handling.) Meanwhile,
the default timeout is 60 seconds, so the test type may need to change to LargeTests. I will open
a new issue to change the config of all TestAsync*AdminApi tests. [~openinx]
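
To make the timing argument concrete, here is a minimal Java sketch that recomputes the intervals from the log timestamps quoted above. It uses only `java.time` from the JDK; the class name `DisableTableTimeline` and its helper methods are hypothetical names made up for this illustration and are not part of HBase.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not an HBase class.
public class DisableTableTimeline {
    // Matches the time-of-day part of the log lines, e.g. "17:56:34,967".
    private static final DateTimeFormatter LOG_TIME =
        DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    // Milliseconds between two log timestamps on the same day.
    static long elapsedMs(String from, String to) {
        return Duration.between(
            LocalTime.parse(from, LOG_TIME),
            LocalTime.parse(to, LOG_TIME)).toMillis();
    }

    // An attempt counts as timed out once it has waited at least rpcTimeout.
    static boolean timedOut(long waitTimeMs, long rpcTimeoutMs) {
        return waitTimeMs >= rpcTimeoutMs;
    }

    public static void main(String[] args) {
        String requestArrived = "17:56:34,967";  // master logs the disable request
        String clientGaveUp = "17:56:37,974";    // client logs "Failed after attempts=3"
        String procedureStored = "17:56:38,936"; // ProcedureExecutor logs "Stored procId=15"

        // {waitTime, rpcTimeout} pairs copied from the three CallTimeoutExceptions.
        long[][] attempts = {{1008, 1000}, {1299, 1000}, {668, 660}};
        for (long[] a : attempts) {
            System.out.println("waitTime=" + a[0] + " rpcTimeout=" + a[1]
                + " timedOut=" + timedOut(a[0], a[1]));
        }

        System.out.println("client gave up after "
            + elapsedMs(requestArrived, clientGaveUp) + " ms");
        System.out.println("procedure stored after "
            + elapsedMs(requestArrived, procedureStored) + " ms");
    }
}
```

By these timestamps the client gave up roughly 3 seconds after the master received the request, while the procedure was stored only after roughly 4 seconds, so every attempt expired before the master could return the procedure id.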

> Fix flaky test TestAsyncTableAdminApi and TestAsyncRegionAdminApi
> -----------------------------------------------------------------
>
>                 Key: HBASE-18003
>                 URL: https://issues.apache.org/jira/browse/HBASE-18003
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Guanghao Zhang
>            Assignee: Zheng Hu
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18003.v1.patch, HBASE-18003.v2.patch, HBASE-18003.v2.patch
>
>
> See https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
