hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Client Side buffering vs WAL
Date Tue, 07 Sep 2010 14:44:55 GMT


Came across a problem that I need to walk through.

On the client side, once you instantiate an HTable object, you can call HTable.setAutoFlush(true/false).
Setting the value to true means that when you execute a put(), the write is not buffered
on the client and is sent directly to HBase. This overrides the client-side write buffering
that you can set in your configuration files.
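
To make the difference concrete, here is a rough toy model of the two modes. It is only a sketch and does not use the HBase client: ToyTable and serverGet() are made-up names, the method names setAutoFlush() and flushCommits() are borrowed from HTable purely for readability, and the buffer limit is counted in puts as a simplification (the real client-side write buffer is sized in bytes).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a hypothetical stand-in for a client table handle, NOT the HBase API.
class ToyTable {
    // What other clients could read, i.e. writes that actually reached the "server".
    private final Map<String, String> server = new HashMap<>();
    // Writes still sitting in this client's local buffer.
    private final List<String[]> buffer = new ArrayList<>();
    // Assumption: flush once this many puts are pending (real buffers are byte-sized).
    private final int bufferLimit;
    private boolean autoFlush = true;

    ToyTable(int bufferLimit) {
        this.bufferLimit = bufferLimit;
    }

    void setAutoFlush(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    void put(String row, String value) {
        buffer.add(new String[] {row, value});
        // With auto-flush on, every put goes out immediately;
        // otherwise it waits until the buffer fills or flushCommits() is called.
        if (autoFlush || buffer.size() >= bufferLimit) {
            flushCommits();
        }
    }

    void flushCommits() {
        for (String[] kv : buffer) {
            server.put(kv[0], kv[1]);
        }
        buffer.clear();
    }

    // What a different client would see for this row (null if not yet sent).
    String serverGet(String row) {
        return server.get(row);
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable(3);

        table.put("row1", "a");                       // auto-flush on: sent at once
        System.out.println(table.serverGet("row1"));  // a

        table.setAutoFlush(false);
        table.put("row2", "b");                       // buffered on the client
        System.out.println(table.serverGet("row2"));  // null

        table.flushCommits();                         // explicit flush
        System.out.println(table.serverGet("row2"));  // b
    }
}
```

In this model, with auto-flush off, a record is invisible to everyone else until the buffer fills or the client flushes explicitly, which is the delay I'm worried about.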

While for many applications it's OK for the app to buffer up its writes, there's a
set of apps where you don't want to do this. That is, when your app writes a record to HBase,
you want it exposed ASAP.

On the server side, you have the Write Ahead Log.

If I understand the WAL correctly, it abstracts the actual process of writing to disk so that as far
as your application is concerned, once you write to the WAL, it's in HBase.

So, my question is: how long does it take for a record in the WAL to be written to disk?

Also, if a record is in the WAL and I do a get(), will the record be found?

It's possible that in an m/r job, client-side buffering could mean that it takes a
relatively 'long' time to actually have a record written to HBase, whereas once the record
is written to the WAL, the time it takes to be written to disk and become accessible to
other HBase apps should be consistent.

Or what am I missing?


