Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@phoenix.apache.org
Date: Fri, 14 Apr 2017 22:03:41 +0000 (UTC)
From: "James Taylor (JIRA)" <jira@apache.org>
To: dev@phoenix.apache.org
Message-ID: <JIRA.13064136.1492198618000.287713.1492207421585@Atlassian.JIRA>
In-Reply-To: <JIRA.13064136.1492198618000@Atlassian.JIRA>
References: <JIRA.13064136.1492198618000@Atlassian.JIRA> <JIRA.13064136.1492198618270@jira-lw-us.apache.org>
Subject: [jira] [Commented] (PHOENIX-3789) Execute cross region index
 maintenance calls outside of row lock
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 14 Apr 2017 22:03:47 -0000


    [ https://issues.apache.org/jira/browse/PHOENIX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969593#comment-15969593 ] 

James Taylor commented on PHOENIX-3789:
---------------------------------------

bq. If the mutation's durability is SKIP_WAL then Indexer will try to update indexes out of the PRE hook (preBatchMutate). Good news is we're only under the region read lock here, not rowlocks.

I'm looking at HRegion.doMiniBatchMutation() and it seems to me that row locks are in place during preBatchMutate (see getRowLockInternal here):
{code}
        // If we haven't got any rows in our batch, we should block to
        // get the next one.
        boolean shouldBlock = numReadyToWrite == 0;
        RowLock rowLock = null;
        try {
          rowLock = getRowLockInternal(mutation.getRow(), shouldBlock);
        } catch (IOException ioe) {
          LOG.warn("Failed getting lock in batch put, row="
            + Bytes.toStringBinary(mutation.getRow()), ioe);
        }
        if (rowLock == null) {
          // We failed to grab another lock
          break; // stop acquiring more rows for this batch
        } else {
          acquiredRowLocks.add(rowLock);
        }

        lastIndexExclusive++;
        numReadyToWrite++;

        if (isPutMutation) {
          // If Column Families stay consistent through out all of the
          // individual puts then metrics can be reported as a mutliput across
          // column families in the first put.
          if (putsCfSet == null) {
            putsCfSet = mutation.getFamilyCellMap().keySet();
          } else {
            putsCfSetConsistent = putsCfSetConsistent
                && mutation.getFamilyCellMap().keySet().equals(putsCfSet);
          }
        } else {
          if (deletesCfSet == null) {
            deletesCfSet = mutation.getFamilyCellMap().keySet();
          } else {
            deletesCfSetConsistent = deletesCfSetConsistent
                && mutation.getFamilyCellMap().keySet().equals(deletesCfSet);
          }
        }
      }

      // we should record the timestamp only after we have acquired the rowLock,
      // otherwise, newer puts/deletes are not guaranteed to have a newer timestamp
      now = EnvironmentEdgeManager.currentTimeMillis();
      byte[] byteNow = Bytes.toBytes(now);

      // Nothing to put/delete -- an exception in the above such as NoSuchColumnFamily?
      if (numReadyToWrite <= 0) return 0L;

      // We've now grabbed as many mutations off the list as we can

      // ------------------------------------
      // STEP 2. Update any LATEST_TIMESTAMP timestamps
      // ----------------------------------
      for (int i = firstIndex; i < lastIndexExclusive; i++) {
        // skip invalid
        if (batchOp.retCodeDetails[i].getOperationStatusCode()
            != OperationStatusCode.NOT_RUN) continue;

        Mutation mutation = batchOp.getMutation(i);
        if (mutation instanceof Put) {
          updateKVTimestamps(familyMaps[i].values(), byteNow);
          noOfPuts++;
        } else {
          if (!isInReplay) {
            prepareDeleteTimestamps(mutation, familyMaps[i], byteNow);
          }
          noOfDeletes++;
        }
        rewriteCellTags(familyMaps[i], mutation);
      }

      lock(this.updatesLock.readLock(), numReadyToWrite);
      locked = true;

      //
      // ------------------------------------
      // Acquire the latest mvcc number
      // ----------------------------------
      w = mvcc.beginMemstoreInsert();

      // calling the pre CP hook for batch mutation
      if (!isInReplay && coprocessorHost != null) {
        MiniBatchOperationInProgress<Mutation> miniBatchOp =
          new MiniBatchOperationInProgress<Mutation>(batchOp.getMutationsForCoprocs(),
          batchOp.retCodeDetails, batchOp.walEditsFromCoprocessors, firstIndex, lastIndexExclusive);
        if (coprocessorHost.preBatchMutate(miniBatchOp)) return 0L;
      }
{code}
 but are *not* in place during postBatchMutate (see releaseRowLocks right before STEP 7):
{code}
      // -------------------------------
      // STEP 6. Release row locks, etc.
      // -------------------------------
      if (locked) {
        this.updatesLock.readLock().unlock();
        locked = false;
      }
      releaseRowLocks(acquiredRowLocks);

      // -------------------------
      // STEP 7. Sync wal.
      // -------------------------
      if (hasWalAppends) {
        syncOrDefer(txid, durability);
      }
      doRollBackMemstore = false;
      // calling the post CP hook for batch mutation
      if (!isInReplay && coprocessorHost != null) {
        MiniBatchOperationInProgress<Mutation> miniBatchOp =
          new MiniBatchOperationInProgress<Mutation>(batchOp.getMutationsForCoprocs(),
          batchOp.retCodeDetails, batchOp.walEditsFromCoprocessors, firstIndex, lastIndexExclusive);
        coprocessorHost.postBatchMutate(miniBatchOp);
      }
{code}

If that's the case, there's no need for this patch. Am I missing something, [~apurtell] & [~gjacoby]?

> Execute cross region index maintenance calls outside of row lock
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-3789
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3789
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.11.0, 4.10.1
>
>         Attachments: PHOENIX-3789.patch, PHOENIX-3789_v2.patch
>
>
> Making cross region server calls while the row is locked can lead to a greater chance of resource starvation. We can use the postBatchMutateIndispensably hook instead of the postBatchMutate call for our processing.


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)