hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file
Date Mon, 12 May 2014 04:21:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994786#comment-13994786 ]

stack commented on HBASE-11135:

[~jeffreyz] Sorry, can you restate the below.  I do not follow:

bq. Stack Have you played the following? Because HRegion#getNextSequenceId() is called during
a flush with updatesLock.writeLock().

What are you saying with the above? (Pardon me)

bq. If we use two ring buffer, HRegion#getNextSequenceId() will return very quickly because
it won't wait for previous work items such as preWAL, postWAL, writer.append() etc.

I can try it.  Looking at it, it just seemed overly-complicated.

It is my sense that the slowdown is the hand-off across the ring buffer, the passing back
of the sequence id to the handler, that is holding it up (I did a quick measure where I commented
out the latch code that is in this patch -- i.e. not having the handler wait on the sequence
id -- and I got back the same numbers we currently have in trunk).  If we do a double ring
buffer, we will still have to do thread handoff across the first buffer so I foresee us having
the same slowdown only w/ a more complex system.  What do you think?
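
To make the hand-off concrete, here is a minimal sketch of the pattern described above: a handler publishes an edit, a single consumer thread assigns the sequence id, and the handler blocks on a latch until the id comes back. It is an illustration only, not the code in the patch. It assumes a plain ArrayBlockingQueue in place of the ring buffer, and the names SequenceIdHandoff, PendingEdit and append() are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: not HBase code. A BlockingQueue stands in for the ring buffer.
final class SequenceIdHandoff {
    /** Hypothetical stand-in for an edit waiting for its sequence id. */
    static final class PendingEdit {
        private final CountDownLatch assigned = new CountDownLatch(1);
        private volatile long sequenceId = -1;

        void assign(long id) {
            sequenceId = id;
            assigned.countDown(); // wakes the waiting handler
        }

        long awaitSequenceId() throws InterruptedException {
            assigned.await(); // the thread hand-off cost is paid here
            return sequenceId;
        }
    }

    private final BlockingQueue<PendingEdit> queue = new ArrayBlockingQueue<>(1024);
    private final Thread consumer;

    SequenceIdHandoff() {
        // Single consumer: ids are handed out in the order edits leave the queue.
        consumer = new Thread(() -> {
            long nextId = 0;
            try {
                while (true) {
                    queue.take().assign(++nextId);
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // exit on shutdown
            }
        }, "seqid-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    /** Called by a handler thread; returns once the consumer has assigned an id. */
    public long append() {
        PendingEdit edit = new PendingEdit();
        try {
            queue.put(edit);
            return edit.awaitSequenceId();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sequence id", ie);
        }
    }

    public void shutdown() {
        consumer.interrupt();
    }
}
```

Every call to append() crosses to the consumer thread and back, which is the round trip suspected of causing the slowdown. Dropping the await() corresponds to the experiment where the latch code was commented out.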

Does this patch get you what you wanted and unblock your mvcc + sequence id unification?
 If so, should we commit it?  This patch is getting big.  I can work on perf in a new issue,
a smaller one.

> Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file
> ----------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-11135
>                 URL: https://issues.apache.org/jira/browse/HBASE-11135
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: stack
>            Assignee: stack
>         Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt
> Currently we assign the region edit/sequence id just before we put it in the WAL.  We
do it in the single thread that feeds from the ring buffer.  Doing it at this point, we can
ensure order, that the edits will be in the file in accordance w/ the ordering of the region
sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in the WAL
system and there is a lag between our putting an edit into the WAL system and the edit actually
getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region sequence
id, especially around async WAL writes (and, related, for no-WAL writes) -- the parent for
this issue (For async, how you get the edit id in our system when the threads have all gone
home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of getting the
region sequence id near-immediately.  We'll run two ringbuffers.  The first will mesh all
handler threads and the consumer will generate ids (we will have order on other side of this
first ring buffer), and then if async or no sync, we will just let the threads return ...
updating mvcc just before we let them go.  All other calls will go up on to the second ring
buffer to be serviced as now (batching, distribution out among the sync'ing threads).  The
first rb will have no friction and should turn at fast rates compared to the second.  There
should not be noticeable slowdown nor do I foresee this refactor interfering w/ our multi-WAL
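
The two-ringbuffer idea in the description can be sketched roughly as follows. This is a simplified illustration, not the proposed implementation: two ArrayBlockingQueues stand in for the ring buffers, a single thread plays the sync'ing threads, and the names TwoStageAppend, Edit, append() and syncedIds() are hypothetical. The first stage assigns ids and releases edits that need no sync straight away; everything else moves to the second stage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: not HBase code. Queues stand in for the two ring buffers.
final class TwoStageAppend {
    interface Loop { void run() throws InterruptedException; }

    static final class Edit {
        final boolean needsSync;
        final CountDownLatch released = new CountDownLatch(1);
        volatile long sequenceId = -1;
        Edit(boolean needsSync) { this.needsSync = needsSync; }
    }

    private final BlockingQueue<Edit> first = new ArrayBlockingQueue<>(1024);
    private final BlockingQueue<Edit> second = new ArrayBlockingQueue<>(1024);
    // Records which ids reached the second stage, in arrival order.
    private final List<Long> synced = Collections.synchronizedList(new ArrayList<>());
    private final Thread idAssigner;
    private final Thread syncer;

    TwoStageAppend() {
        // Stage 1: one consumer meshes all handlers and generates ids in order.
        idAssigner = startDaemon("stage1-ids", () -> {
            long nextId = 0;
            while (true) {
                Edit e = first.take();
                e.sequenceId = ++nextId;
                if (e.needsSync) {
                    second.put(e);            // serviced by the slower second stage
                } else {
                    e.released.countDown();   // async / no-WAL: let the handler go now
                }
            }
        });
        // Stage 2: stands in for batching and the sync'ing threads.
        syncer = startDaemon("stage2-sync", () -> {
            while (true) {
                Edit e = second.take();
                synced.add(e.sequenceId);     // placeholder for the real append + sync
                e.released.countDown();
            }
        });
    }

    private static Thread startDaemon(String name, Loop loop) {
        Thread t = new Thread(() -> {
            try {
                loop.run();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // exit on shutdown
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Called by a handler; returns the edit's sequence id once it is released. */
    public long append(boolean needsSync) {
        Edit e = new Edit(needsSync);
        try {
            first.put(e);
            e.released.await();
            return e.sequenceId;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during append", ie);
        }
    }

    public List<Long> syncedIds() {
        synchronized (synced) {
            return new ArrayList<>(synced);
        }
    }

    public void shutdown() {
        idAssigner.interrupt();
        syncer.interrupt();
    }
}
```

In this sketch an edit that needs no sync is released as soon as the first stage has assigned its id, which is where the mvcc update mentioned above would go. Note that even the fast path still crosses one thread boundary.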

This message was sent by Atlassian JIRA