hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file
Date Sun, 11 May 2014 05:51:14 GMT

     [ https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-11135:

    Attachment: 11135v5.txt

> Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file
> ----------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-11135
>                 URL: https://issues.apache.org/jira/browse/HBASE-11135
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: stack
>            Assignee: stack
>         Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt
> Currently we assign the region edit/sequence id just before we put it in the WAL.  We
do it in the single thread that feeds from the ring buffer.  Doing it at this point, we can
ensure order, that the edits will be in the file in accordance w/ the ordering of the region
sequence id.
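
To make the current behaviour concrete, here is a minimal sketch in plain Java. It is not HBase code: Edit, LateBindingWal, publish, consume and demo are hypothetical names, a LinkedBlockingQueue stands in for the ring buffer, and a list stands in for the WAL file. It only illustrates the point made above: because one consumer thread both assigns the id and appends, file order and sequence id order cannot diverge.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a WAL edit; not an HBase class.
final class Edit {
    final String payload;
    long sequenceId = -1; // stays unassigned until the consumer thread sees the edit
    Edit(String payload) { this.payload = payload; }
}

final class LateBindingWal {
    // A queue stands in for the ring buffer that the handler threads feed.
    private final BlockingQueue<Edit> ring = new LinkedBlockingQueue<>();
    // A list stands in for the WAL file; only the consumer thread touches it.
    private final List<Edit> file = new ArrayList<>();
    private long nextSequenceId = 1; // only the consumer thread touches this

    // Called by handler threads; the edit has no sequence id yet when this returns.
    void publish(Edit e) { ring.add(e); }

    // Single consumer: assign the id and append in the same step, so file order
    // always matches sequence id order.
    void consume(int count) throws InterruptedException {
        for (int i = 0; i < count; i++) {
            Edit e = ring.take();
            e.sequenceId = nextSequenceId++;
            file.add(e);
        }
    }

    // Runs `handlers` producer threads with `perHandler` edits each and returns
    // the sequence ids in the order they landed in the "file".
    static List<Long> demo(int handlers, int perHandler) {
        LateBindingWal wal = new LateBindingWal();
        List<Thread> threads = new ArrayList<>();
        for (int h = 0; h < handlers; h++) {
            final int id = h;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perHandler; i++) {
                    wal.publish(new Edit("h" + id + "-" + i));
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            wal.consume(handlers * perHandler); // the calling thread acts as the consumer
            for (Thread t : threads) t.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        List<Long> ids = new ArrayList<>();
        for (Edit e : wal.file) ids.add(e.sequenceId);
        return ids;
    }
}
```

In this sketch a handler that calls publish gets control back before its edit has an id, which is the lag the next paragraph describes.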
> But the point at which the region sequence id is assigned to an edit is deep down in the WAL
system and there is a lag between our putting an edit into the WAL system and the edit actually
getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region sequence
id, especially around async WAL writes (and, related, for no-WAL writes) -- the parent for
this issue (For async, how you get the edit id in our system when the threads have all gone
home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of getting the
region sequence id near-immediately.  We'll run two ringbuffers.  The first will mesh all
handler threads and the consumer will generate ids (we will have order on other side of this
first ring buffer), and then if async or no sync, we will just let the threads return ...
updating mvcc just before we let them go.  All other calls will go up on to the second ring
buffer to be serviced as now (batching, distribution out among the sync'ing threads).  The
first rb will have no friction and should turn at fast rates compared to the second.  There
should not be noticeable slowdown nor do I foresee this refactor interfering w/ our multi-WAL
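
The two-ringbuffer idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not the patch attached to this issue: StagedEdit, TwoStageWal, append, runSequencer, runSyncer and demo are hypothetical names, plain queues stand in for the ring buffers, and the sketch assumes every edit is still handed to the second stage in id order while only sync callers wait for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical edit type; the two futures let a handler wait on either stage.
final class StagedEdit {
    final boolean needsSync;
    long sequenceId; // written by stage one before the edit is handed on
    final CompletableFuture<Long> idAssigned = new CompletableFuture<>();
    final CompletableFuture<Void> synced = new CompletableFuture<>();
    StagedEdit(boolean needsSync) { this.needsSync = needsSync; }
}

final class TwoStageWal {
    private final BlockingQueue<StagedEdit> first = new LinkedBlockingQueue<>();
    private final BlockingQueue<StagedEdit> second = new LinkedBlockingQueue<>();
    private final List<Long> file = new ArrayList<>(); // only stage two touches this
    private long nextSequenceId = 1;                   // only stage one touches this

    // Handler side: the id comes back as soon as stage one has run.
    long append(StagedEdit e) {
        first.add(e);
        long id = e.idAssigned.join();
        // An mvcc update would go here, just before an async caller returns.
        if (e.needsSync) e.synced.join(); // only sync callers wait for stage two
        return id;
    }

    private static StagedEdit take(BlockingQueue<StagedEdit> q) {
        try {
            return q.take();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
    }

    // Stage one: a single thread orders the edits and hands out ids.
    void runSequencer(int count) {
        for (int i = 0; i < count; i++) {
            StagedEdit e = take(first);
            e.sequenceId = nextSequenceId++;
            second.add(e); // assumption: every edit is still written, in id order
            e.idAssigned.complete(e.sequenceId);
        }
    }

    // Stage two: a single thread stands in for the batching and sync'ing side.
    void runSyncer(int count) {
        for (int i = 0; i < count; i++) {
            StagedEdit e = take(second);
            file.add(e.sequenceId);
            e.synced.complete(null);
        }
    }

    // Runs both stages plus `handlers` handler threads; returns the ids in file order.
    static List<Long> demo(int handlers, int perHandler) {
        TwoStageWal wal = new TwoStageWal();
        final int total = handlers * perHandler;
        List<Thread> threads = new ArrayList<>();
        threads.add(new Thread(() -> wal.runSequencer(total)));
        threads.add(new Thread(() -> wal.runSyncer(total)));
        for (int h = 0; h < handlers; h++) {
            final boolean sync = (h % 2 == 0); // even handlers sync, odd ones are async
            threads.add(new Thread(() -> {
                for (int i = 0; i < perHandler; i++) wal.append(new StagedEdit(sync));
            }));
        }
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return new ArrayList<>(wal.file);
    }
}
```

Calling TwoStageWal.demo(4, 50) pushes 200 edits through both stages. The first stage does nothing but take, number and forward, which is why it is expected to turn faster than the second.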

This message was sent by Atlassian JIRA
