hbase-issues mailing list archives

From "zhangduo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13011) TestLoadIncrementalHFiles is flakey when using AsyncRpcClient as client implementation
Date Thu, 12 Feb 2015 14:25:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318234#comment-14318234
] 

zhangduo commented on HBASE-13011:
----------------------------------

Oh, there is a patch already.
IMHO, the patch does not work. It certainly reduces the probability of writing
one call twice, but it cannot prevent every case, because checking call.writeLock and
setting it are two separate steps.
Consider this interleaving:
t1 checks call.writeLock, it is false
t2 checks call.writeLock, it is still false
t1 sets call.writeLock to true and calls writeRequest
t2 sets call.writeLock to true and calls writeRequest
OK, the call is written twice...

Of course there are synchronization methods that could work without a lock, but I'd say
they are all complicated.
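
For illustration, here is a minimal sketch of one lock-free way to close the window in the interleaving above: fold the check and the set into a single atomic compare-and-set, so only one thread can win the right to write. This is not the attached patch and not the AsyncRpcClient code. The Call class, tryClaimWrite, writeRequest and raceWriters are hypothetical stand-ins, and only the java.util.concurrent calls are real APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class WriteOnceSketch {
    /** Hypothetical stand-in for a call object; not the HBase class. */
    static final class Call {
        private final AtomicBoolean writeClaimed = new AtomicBoolean(false);
        final AtomicInteger timesWritten = new AtomicInteger();

        /** Check and set in one atomic step: true for exactly one caller. */
        boolean tryClaimWrite() {
            return writeClaimed.compareAndSet(false, true);
        }

        /** Stand-in for writing the request; it only counts invocations. */
        void writeRequest() {
            timesWritten.incrementAndGet();
        }
    }

    /** Races the given number of threads on one call; returns how often it was written. */
    static int raceWriters(int threads) {
        Call call = new Call();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // hold all threads so they contend together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (call.tryClaimWrite()) {
                    call.writeRequest();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return call.timesWritten.get();
    }

    public static void main(String[] args) {
        System.out.println("writes = " + raceWriters(8));
    }
}
```

With a plain boolean field and a separate "if (!flag) { flag = true; ... }", t1 and t2 could both pass the check. With compareAndSet only one of them gets true back. Whether this fits the real client depends on details not shown here, for example what should happen to the thread that loses the race.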

> TestLoadIncrementalHFiles is flakey when using AsyncRpcClient as client implementation
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-13011
>                 URL: https://issues.apache.org/jira/browse/HBASE-13011
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.1.0
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13011.patch, HBASE-13011_1.patch
>
>
> The test sometimes fails because of a timeout.
> https://builds.apache.org/job/PreCommit-HBASE-Build/12769/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/
> Digging into it, I found this
> {noformat}
> 2015-02-11 02:01:47,304 INFO  [LoadIncrementalHFiles-1] mapreduce.LoadIncrementalHFiles(563):
Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
first=ddd last=ooo
> 2015-02-11 02:01:47,308 INFO  [LoadIncrementalHFiles-0] mapreduce.LoadIncrementalHFiles(563):
Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
first=aaaa last=cccc
> 2015-02-11 02:01:47,317 DEBUG [LoadIncrementalHFiles-2] mapreduce.LoadIncrementalHFiles$3(664):
Going to connect to server region=bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row  with hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0}]
> 2015-02-11 02:01:47,320 DEBUG [LoadIncrementalHFiles-3] mapreduce.LoadIncrementalHFiles$3(664):
Going to connect to server region=bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row ddd with hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1}]
> {noformat}
> There are two files to commit, but after this
> {noformat}
> 2015-02-11 02:01:47,327 INFO  [B.defaultRpcServer.handler=3,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.
> 2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=1,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> 2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=4,queue=0,port=41003] regionserver.HStore(690):
Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> {noformat}
> We can see that hfile_1 has been committed twice; the second call fails and causes
the test to time out.
> I'm not sure if it is an issue with AsyncRpcClient. But if I use RpcClientImpl, the test
always passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
