hbase-issues mailing list archives

From "zhangduo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13011) TestLoadIncrementalHFiles is flakey when using AsyncRpcClient as client implementation
Date Fri, 13 Feb 2015 05:12:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319592#comment-14319592
] 

zhangduo commented on HBASE-13011:
----------------------------------

TestHMasterRPCException is flakey. I think the problem is the test itself.

The test tries to connect to HMaster several times until it gets a ServerNotRunningYetException.
But we do not set any guard to prevent HMaster from transitioning to the running state, so
by the time we successfully connect to HMaster, it may already be running (especially
on heavily loaded machines)...
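
A minimal sketch of the kind of guard meant here. It does not use any HBase classes: ToyMaster, NotRunningYetException and rejectedAsNotRunning are hypothetical stand-ins, and the assumption is only that startup can be made to block on something the test controls (here a CountDownLatch). While the latch is closed, the "master" cannot reach the running state, so the test always observes the not-running-yet rejection instead of racing against startup.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class StartupGuardSketch {

    /** Stand-in for ServerNotRunningYetException; NOT the HBase class. */
    static class NotRunningYetException extends Exception {
        NotRunningYetException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical toy server: reachable at once, but "running" only after the guard opens. */
    static class ToyMaster {
        private final AtomicBoolean running = new AtomicBoolean(false);
        private final Thread initThread;

        ToyMaster(CountDownLatch guard) {
            this.initThread = new Thread(() -> {
                try {
                    guard.await();      // startup blocks until the test releases the guard
                    running.set(true);  // only now does the master become "running"
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        void start() {
            initThread.start();
        }

        void join() throws InterruptedException {
            initThread.join();
        }

        boolean isRunning() throws NotRunningYetException {
            if (!running.get()) {
                throw new NotRunningYetException("still initializing");
            }
            return true;
        }
    }

    /** Returns true if the call was rejected with NotRunningYetException. */
    static boolean rejectedAsNotRunning(ToyMaster master) {
        try {
            master.isRunning();
            return false;
        } catch (NotRunningYetException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch guard = new CountDownLatch(1);
        ToyMaster master = new ToyMaster(guard);
        master.start();

        // Guard still closed: the master cannot have become running yet.
        System.out.println("rejected before release: " + rejectedAsNotRunning(master));

        guard.countDown(); // let startup finish
        master.join();     // wait until the init thread has set the running flag
        System.out.println("rejected after release: " + rejectedAsNotRunning(master));
    }
}
```

With the guard in place the first check is rejected no matter how slow or fast the machine is, and the second check succeeds once the guard is released. How such a hook would be wired into a real HMaster startup is left open here.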

Also, I cannot view the log files of the other failed tests; maybe something is wrong with Jenkins?
I ran these tests locally and they all passed.

> TestLoadIncrementalHFiles is flakey when using AsyncRpcClient as client implementation
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-13011
>                 URL: https://issues.apache.org/jira/browse/HBASE-13011
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.1.0
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13011.patch, HBASE-13011_1.patch, HBASE-13011_2.patch
>
>
> The test sometimes fails because of a timeout.
> https://builds.apache.org/job/PreCommit-HBASE-Build/12769/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/
> Digging into it, I found this:
> {noformat}
> 2015-02-11 02:01:47,304 INFO  [LoadIncrementalHFiles-1] mapreduce.LoadIncrementalHFiles(563):
Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
first=ddd last=ooo
> 2015-02-11 02:01:47,308 INFO  [LoadIncrementalHFiles-0] mapreduce.LoadIncrementalHFiles(563):
Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
first=aaaa last=cccc
> 2015-02-11 02:01:47,317 DEBUG [LoadIncrementalHFiles-2] mapreduce.LoadIncrementalHFiles$3(664):
Going to connect to server region=bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row  with hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0}]
> 2015-02-11 02:01:47,320 DEBUG [LoadIncrementalHFiles-3] mapreduce.LoadIncrementalHFiles$3(664):
Going to connect to server region=bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row ddd with hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1}]
> {noformat}
> There are two files to commit, but after this
> {noformat}
> 2015-02-11 02:01:47,327 INFO  [B.defaultRpcServer.handler=3,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.
> 2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=1,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> 2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=4,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> {noformat}
> We can see that hfile_1 has been committed twice; the second call will fail and cause the test to time out.
> I'm not sure if it is an issue with AsyncRpcClient. But if I use RpcClientImpl, the test always passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
