hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hiroshi Ikeda (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read
Date Tue, 05 Jul 2016 08:51:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362226#comment-15362226

Hiroshi Ikeda commented on HBASE-15716:

bq. I should make getReadPoint private. It is only for use in this class. I should remove
'abstract long getMvccReadPoint();'?

Sorry I forgot to delete getReadPoint. I experimentally saw how the code becomes messy if
I collect the duplicate logic about getting a read point with a isolation level from HRegion.getReadpoint
(as a comment I wrote a few days ago).

The abstract method getMvccReadPoint is needed to avoid taking the whole MultiVersionConsistencyControl
instance, that is regrettably large concrete object. Using such a concrete class would make
our class unnecessarily be noisy and off the point, and also it would be difficult to create
a test.

bq. On this comment, " // Ignore the result; Another thread already did for you.", you mean
another thread has move the tail on further than what we wanted to set it too?

Yes. In fact it is not needed to do that operation here, because that operation will be eventually
done when it is actually required. We just allow the thread to wipe up its operation's mess
and/or help GC just a little.

bq. Should we be upping the tail reference count? We only checks if the tail.readPoint is
less than mvccReadPoint. We didn't check it is equal. Could tail be > mvccReadPoint? It
shouldn't ever be I suppose.. Is that what you are depending on here?

Do you mean logic in getEntry? In this case, you are right and always tail.readPoint <=
mvccReadpoint because getting the tail happens before getting the mvccReadPoint. Even if the
relation would be not established, tail.readPoint > mvccReadPoint means, the mvccReadPoint
is already stale and another thread updates the tail with the newer mvccReadPoint, and then
it would be enough to get a lift in it. It is enough to get the read point which is equal
to or newer than the mvccReadPoint at the moment you enter the method.

Or do you mean about the mvccReadPoint's advancing? I wrote getMvccReadPoint is always advancing
in the javadoc but I just meant that doesn't go back and possibility that returns the same

This should never happen (mvcc read point should always be > head.readPoint...)?

110 + long mvccReadPoint = getMvccReadPoint();
111 + if (head.readPoint >= mvccReadPoint)
112 + return head.readPoint;

The same thing can be said and always head.readPoint <= mvccReadPoint.

bq. Any suggestions on how to test for correctness?
That's difficult... :(

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read
> ----------------------------------------------------------------------------------
>                 Key: HBASE-15716
>                 URL: https://issues.apache.org/jira/browse/HBASE-15716
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: stack
>            Assignee: stack
>         Attachments: 15716.implementation.using.ScannerReadPoints.branch-1.patch, 15716.prune.synchronizations.patch,
15716.prune.synchronizations.v3.patch, 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch,
15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, HBASE-15716.branch-1.002.patch,
HBASE-15716.branch-1.003.patch, HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch,
ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 2.05.45 PM.png,
Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, Screen
Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27
at 9.49.35 AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 9.54.08
PM.png, TestScannerReadPoints.java, before_after.png, current-branch-1.vs.NoSynchronization.vs.Patch.png,
hits.png, remove.locks.patch, remove_cslm.patch
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it with the
scanner instance in a Region scoped CSLM. This is done under a synchronize on the CSLM.
> This synchronize on a region-scoped Map creating region scanners is the outstanding point
of lock contention according to flight recorder (My work load is workload c, random reads).

This message was sent by Atlassian JIRA

View raw message