hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)
Date Mon, 12 Aug 2013 18:06:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737128#comment-13737128
] 

stack commented on HBASE-8741:
------------------------------

Is this in the initialize method [~himanshu@cloudera.com]?

+      } else {
+        this.regionSequenceId.set(nextSeqId);
       }

If so, doesn't it usually return the seqid as a result?  Are you changing it in the above?

I dug in on this:

           status.setStatus("Flush will not be started for [" + this.getRegionInfo().getEncodedName()
-              + "] - WAL is going away");

Are we for sure the WAL is going away?  Is that the only reason flush call can come back false?
 Would suggest not make what could be a false report... just leave this 'WAL is going away'
off the message.

Are we sure that edits will go into the WAL in order?  I see us assign the ids up in HRegion
in order and then pass off to the WAL to write in append but I wonder if the thread could
be stalled before it gets to the synchronized (this.updateLock) in append and another thread
with a higher seqid get a write in ahead of us?  Would it be better to pass the AtomicLong
reference into FSHLog and let it do the increment under the this.updateLock?

Why do we have an Interface named LastSequenceId and then another named SequenceIds?  Should
the one Interface provide bother services?  They certainly look related (get all sequenceids
for all regions and get the sequenceid for a particular region).  They have the same implementor
(the regionserver)

Why would you make this change breaking the formatting (removing the white space)?

-        this.region.getRegionInfo(), compactionDescriptor);
+        this.region.getRegionInfo(),compactionDescriptor,

Where is the format for WAL logs set? Should the note in LOG_FILES_COMPARATOR refer to it
else someone may change the format and break this comparator.

On the below:

+   * @param sequenceIds a service to get regions information while rolling.

It is not 'information', it is sequenceids for all regions involved in a WAL?

What happens if sequenceids is null?

+          if (sequenceIds != null) {
+            regionsToSequenceId = sequenceIds.obtainRegionsSequenceId();
+          }

Does stuff work anyways?  That would be good in tests I'd imagine.

No code in the file you are editing looks like the below:

+      sequenceId+=countPerFamily;// increasing sequence Id counter by number of appends.

Why introduce a new style?  You do it in a few places.

Or here...
+      // Inserted 6k entries; seqId would b 6001, and sequenceId would be 6001 too.
+          assertTrue("seeId: " + seqid +", sequenceId: "+sequenceId, seqid == sequenceId);

These are very minor nits but they make me not trust your patch if this basic stuff is not
right.

Are you repeating this code for each test?

+    SequenceIds mockSequenceIds = Mockito.mock(SequenceIds.class);
+    final Map<byte[], Long> regionsSequenceIds = new HashMap<byte[], Long>();
+    Mockito.when(mockSequenceIds.obtainRegionsSequenceId()).thenAnswer(
+      new Answer<Map<byte[], Long>>() {
+        @Override
+        public Map<byte[], Long> answer(InvocationOnMock invocation) throws Throwable
{
+          return new HashMap<byte[], Long>(regionsSequenceIds);
+        }
+      });

Why not do it once on test setup?   Or put it in the junit setup method rather than duplicate
code?

I am half way through.  That is enough feedback for now.


 
                
> Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in
recovery mode might have same sequenceIDs)
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8741
>                 URL: https://issues.apache.org/jira/browse/HBASE-8741
>             Project: HBase
>          Issue Type: Bug
>          Components: MTTR
>    Affects Versions: 0.95.1
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch,
HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, HBASE-8741-v5.patch
>
>
> Currently, when opening a region, we find the maximum sequence ID from all its HFiles
and then set the LogSequenceId of the log (in case the later is at a small value). This works
good in recovered.edits case as we are not writing to the region until we have replayed all
of its previous edits. 
> With distributed log replay, if we want to enable writes while a region is under recovery,
we need to make sure that the logSequenceId > maximum logSequenceId of the old regionserver.
Otherwise, we might have a situation where new edits have same (or smaller) sequenceIds. 
> We can store region level information in the WALTrailer, than this scenario could be
avoided by:
> a) reading the trailer of the "last completed" file, i.e., last wal file which has a
trailer and,
> b) completely reading the last wal file (this file would not have the trailer, so it
needs to be read completely).
> In future, if we switch to multi wal file, we could read the trailer for all completed
WAL files, and reading the remaining incomplete files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message