hbase-issues mailing list archives

From "Jingcheng Du (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-15959) Fix flaky test TestRegionServerMetrics.testMobMetrics
Date Wed, 08 Jun 2016 06:19:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320076#comment-15320076 ]

Jingcheng Du edited comment on HBASE-15959 at 6/8/16 6:18 AM:
--------------------------------------------------------------

Oops, my mistake. The compacted files are no longer removed directly in the master branch
code (instead, a region server chore does that). I am not sure why this is needed after
compaction, but it can lead to mob issues in exactly this case.
Let me give an example. We have three store files: sf#1 has a seqId of 5, sf#2 has a seqId
of 6, and sf#3 has a seqId of 7. After compaction we have a new store file, sf#4, which has a
seqId of 7. In normal cases, sf#1-3 are removed from the store files in the context (not from
the file system).
After {{region.initialize()}}, some cells that come from sf#3 replace the cells in sf#4, and
this breaks the test. The WAL is supposed to guarantee correctness after the region is
restarted, but the WAL is not replayed in {{region.initialize()}} in this case: the number
of found HLogs is 0.
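
The seqId example above can be sketched as a small toy model. This is only an illustration under stated assumptions, not HBase code: {{ToyStoreFile}}, {{file}} and {{read}} are hypothetical names, each file holds a single cell, and the tie-break on equal seqIds (earlier file in the list wins) is an assumption of the sketch rather than a statement about how HBase orders store files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class SeqIdTieSketch {

    /** Hypothetical stand-in for a store file: a name, a max seqId, and row -> value cells. */
    static final class ToyStoreFile {
        final String name;
        final long maxSeqId;
        final Map<String, String> cells;

        ToyStoreFile(String name, long maxSeqId, Map<String, String> cells) {
            this.name = name;
            this.maxSeqId = maxSeqId;
            this.cells = cells;
        }
    }

    /** Builds a toy store file holding a single cell. */
    static ToyStoreFile file(String name, long maxSeqId, String row, String value) {
        return new ToyStoreFile(name, maxSeqId, Collections.singletonMap(row, value));
    }

    /**
     * Returns the value a reader would see for a row: the file with the highest
     * seqId wins. Assumption of this toy model only: on a seqId tie, the file
     * that appears earlier in the input list wins (List.sort is stable).
     */
    static String read(List<ToyStoreFile> files, String row) {
        List<ToyStoreFile> sorted = new ArrayList<>(files);
        sorted.sort((a, b) -> Long.compare(b.maxSeqId, a.maxSeqId)); // highest seqId first
        for (ToyStoreFile f : sorted) {
            String value = f.cells.get(row);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyStoreFile sf1 = file("sf#1", 5, "row0", "cell from sf#1");
        ToyStoreFile sf2 = file("sf#2", 6, "row0", "cell from sf#2");
        ToyStoreFile sf3 = file("sf#3", 7, "row0", "cell from sf#3");
        ToyStoreFile sf4 = file("sf#4", 7, "row0", "cell from sf#4"); // compaction output

        // Normal case: only the compaction output is left in the store's file list.
        System.out.println(read(Arrays.asList(sf4), "row0"));

        // Re-initialized case: the compacted files are still on disk and are loaded again.
        // sf#3 and sf#4 share seqId 7, so which one wins depends on the tie-break.
        System.out.println(read(Arrays.asList(sf1, sf2, sf3, sf4), "row0"));
    }
}
```

In this sketch the second read returns the cell from sf#3 because sf#3 and sf#4 tie on seqId 7, which mirrors the symptom described above where cells from sf#3 replace cells from sf#4 after the region is initialized again.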


was (Author: jingcheng.du@intel.com):
Oops, my mistake. The compacted files are not directly removed any more in the code of master
branch (instead it uses a region server chore to do that). Not sure why it needs this after
compaction. But this can lead to mob issues just in this case.
Let me give a example, we have three store files, sf#1 has a seqId as 5, sf#2 has a seqId
as 6, and sf#3 has a seqId as 7, after compaction we have a new store file sf#4 who has a
a seqId as 6. In normal cases, sF#1-3 are removed from the store files in the context (not
from file system).
After the {{region.initialize()}}, some of cells come from sf#3 replace the cells in sf#4,
this breaks the test. But the WAL is supposed to guarantee the correctness after region is
restarted. But the WAL is not replayed in {{region.initialize()}} in this case, the number
of found HLogs is 0, 

> Fix flaky test TestRegionServerMetrics.testMobMetrics
> -----------------------------------------------------
>
>                 Key: HBASE-15959
>                 URL: https://issues.apache.org/jira/browse/HBASE-15959
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Appy
>            Assignee: huaxiang sun
>
> It flakes [here|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java#L460].
> There are two weird things I identified:
> 1. In the second compaction, the [scanner|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreCompactor.java#L173] has 10 storefiles. Shouldn't there be 6: 5 from recent flushes and 1 from the earlier compaction? Probably because the mob cleaner doesn't clean old hfiles. Does this need fixing?
> 2. Across runs, the same cell (i.e. same key) may or may not be considered a mob reference cell [here|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreCompactor.java#L213]. This at least happens with row keys 0 - 4 (which got compacted earlier). [~jmhsieh] Any ideas why this would happen?



