hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Allan Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17506) started mvcc transaction is not completed in branch-1
Date Wed, 25 Jan 2017 05:42:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837228#comment-15837228

Allan Yang commented on HBASE-17506:

there will not be different nonce inside a single WALEntry. Nonce is by RPC request. WALEntry
is a subparticle of an rpc request.
If there is no different nonce inside a single WALEntry, then we can remove those code. Those
code is only used in replay mode(region replica or log replay)

> started mvcc transaction is not completed in branch-1
> -----------------------------------------------------
>                 Key: HBASE-17506
>                 URL: https://issues.apache.org/jira/browse/HBASE-17506
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
> In {{doMiniBatchMutation}}, if it is in replay and if the the nonce of the mutation is
different, we append them to a different wal.  But, after HBASE-14465, we start a mvcc transition
in the ringbuffer's append thread.  So, every time we append a wal entry, we started a mvcc
transition, but we didn't complete the mvcc transition anywhere. This can block other transition
of this region.
> {code}
> // txid should always increase, so having the one from the last call is ok.
> // we use HLogKey here instead of WALKey directly to support legacy coprocessors.
>             walKey = new ReplayHLogKey(this.getRegionInfo().getEncodedNameAsBytes(),
>               this.htableDescriptor.getTableName(), now, m.getClusterIds(),
>               currentNonceGroup, currentNonce, mvcc);
>             txid = this.wal.append(this.htableDescriptor,  this.getRegionInfo(),  walKey,
>               walEdit, true);
>             walEdit = new WALEdit(cellCount, isInReplay);
>             walKey = null;
> {code}
> Looked at master branch, there is no such problem. It has a method named{{appendCurrentNonces}}
> {code}
>  private void appendCurrentNonces(final Mutation mutation, final boolean replay,
>        final WALEdit walEdit, final long now, final long currentNonceGroup, final long
>    throws IOException {
>      if (walEdit.isEmpty()) return;
>      if (!replay) throw new IOException("Multiple nonces per batch and not in replay");
>      WALKey walKey = new WALKey(this.getRegionInfo().getEncodedNameAsBytes(),
>          this.htableDescriptor.getTableName(), now, mutation.getClusterIds(),
>          currentNonceGroup, currentNonce, mvcc, this.getReplicationScope());
>      this.wal.append(this.getRegionInfo(), walKey, walEdit, true);
>      // Complete the mvcc transaction started down in append else it will block others
>      this.mvcc.complete(walKey.getWriteEntry());
>    }
> {code}
> Yes, the easiest way to fix branch-1 is to complete the writeEntry like master branch
do. But is it really fine to do this?
> 1. :
> complete the mvcc transition before waiting sync will create a disturbance of data visibility.
> 2.:
> In what circumstance will there be different nonce and nonce group in a single wal entry?
Nonce are used in append/increment. But in {{batchMuate}} ,we treat them differently and append
one wal entry for each of them. So I think no test can reach this code path,  that maybe why
no one has found this bug(Please tell me if I'm wrong).

This message was sent by Atlassian JIRA

View raw message