hbase-dev mailing list archives

From Kannan Muthukkaruppan <Kan...@facebook.com>
Subject row level atomicity
Date Wed, 03 Mar 2010 21:36:56 GMT
The flow during an HRegionServer.put() seems to be the following. [For now, let's just consider
a single-row Put containing edits to multiple column families/columns.]

HRegionServer.put() does a:

1)      HRegion.put()

2)      syncWal()  (the HDFS sync call).

HRegion.put() does a:
  For each column family {
     HLog.append(all edits to the column family);
     Write all edits to Memstore;
  }

HLog.append() does a:
  Foreach edit in a single column family {

doWrite() does a:


(i) It looks like doWrite() makes several calls to this.write.append(), which in turn
does a bunch of individual out.write() calls (to the DFSOutputStream), as opposed to just one
interaction with the underlying DFS. If so, how do we guarantee that either all of the edits
make it to HDFS or none of them do? Or is this just broken?
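
To make the concern in (i) concrete, here is a minimal sketch of one possible alternative: serialize
all edits for the row into a single length-prefixed record and hand it to the stream in one write.
This is not HBase code. RowLogRecord and its methods are hypothetical names, and edits are modeled
as plain strings. A single write call is not atomic by itself either; the length prefix only lets
a reader at replay time detect a torn record and discard it as a whole.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch (not the HLog implementation): one log record per row.
class RowLogRecord {

    // Layout: [int bodyLength][int editCount]([int editLength][edit bytes])*
    static byte[] serialize(List<String> edits) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataOutputStream bodyOut = new DataOutputStream(body);
            bodyOut.writeInt(edits.size());
            for (String edit : edits) {
                byte[] bytes = edit.getBytes(StandardCharsets.UTF_8);
                bodyOut.writeInt(bytes.length);
                bodyOut.write(bytes);
            }
            bodyOut.flush();

            ByteArrayOutputStream record = new ByteArrayOutputStream();
            DataOutputStream recordOut = new DataOutputStream(record);
            recordOut.writeInt(body.size()); // length prefix for the whole row
            body.writeTo(recordOut);
            recordOut.flush();
            return record.toByteArray();
        } catch (IOException e) {
            // In-memory streams are not expected to fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }

    // One interaction with the log stream for the whole row.
    static void append(OutputStream log, List<String> edits) throws IOException {
        log.write(serialize(edits));
    }

    // Replay-side check: true only if the full body announced by the prefix is present.
    static boolean isComplete(byte[] bytes) {
        if (bytes.length < 4) {
            return false;
        }
        int bodyLength = ByteBuffer.wrap(bytes).getInt();
        return bodyLength >= 0 && bytes.length - 4 >= bodyLength;
    }
}
```

With a record like this, replay could skip any trailing record for which isComplete() is false, so
a row's edits would be recovered either together or not at all.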

(ii) The updates to memstore should happen after the sync rather than before,
correct? Otherwise, there is the danger that the write to DFS fails (the sync fails for some
reason) and we return an error to the client, but we have already applied the edits to the
memstore. So subsequent reads could serve uncommitted data.
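
The ordering proposed in (ii) can be sketched as follows. This is an illustration under
assumptions, not the actual HRegion code: SyncFirstRegion, its Wal interface, and fakeWal() are
hypothetical, and the memstore is modeled as a simple list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: apply edits to the memstore only after the log sync succeeds.
class SyncFirstRegion {

    // Stand-in for the write-ahead log; sync() returns false on failure.
    interface Wal {
        void append(List<String> edits);
        boolean sync();
    }

    private final Wal wal;
    private final List<String> memstore = new ArrayList<>();

    SyncFirstRegion(Wal wal) {
        this.wal = wal;
    }

    // Returns true if the edits were made durable and are now visible to reads.
    boolean put(List<String> edits) {
        wal.append(edits);
        if (!wal.sync()) {
            // Sync failed: report the error and leave the memstore untouched,
            // so readers never see edits the client was told had failed.
            return false;
        }
        memstore.addAll(edits);
        return true;
    }

    List<String> read() {
        return new ArrayList<>(memstore);
    }

    // Test helper: a log whose sync() always succeeds or always fails.
    static Wal fakeWal(final boolean syncSucceeds) {
        return new Wal() {
            @Override
            public void append(List<String> edits) {
                // No-op in this sketch.
            }

            @Override
            public boolean sync() {
                return syncSucceeds;
            }
        };
    }
}
```

Note that this only keeps failed edits out of the read path. Edits that were appended but not
synced may still end up in the log, so recovery would need a way to ignore them.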

