hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Is there any way to disable WAL while keeping data safety
Date Thu, 26 May 2011 17:14:19 GMT
You can call flush on the table, from either the shell or HBaseAdmin,
which will persist the MemStore data. The drawback of this trick is
that if any region server dies before you call flush, you need to
re-import.
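
As a rough sketch of the client-side bookkeeping this implies (keep a copy
of every record until a flush succeeds, and re-put the copies if a region
server is lost first), something like the following could work. The
UnloggedStore interface, FakeStore, and FlushCheckpointWriter are
hypothetical names made up for illustration and are not HBase classes; in a
real client the two interface methods would wrap a put with the WAL disabled
and a table flush via the shell or HBaseAdmin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a table client; NOT the real HBase API.
interface UnloggedStore {
    void putWithoutWal(String row, String value); // data lives only in memory
    void flush();                                 // persists the in-memory data
}

// In-memory fake used to exercise the sketch: crash() loses unflushed data.
class FakeStore implements UnloggedStore {
    final Map<String, String> memstore = new HashMap<>();
    final Map<String, String> disk = new HashMap<>();

    public void putWithoutWal(String row, String value) { memstore.put(row, value); }
    public void flush() { disk.putAll(memstore); memstore.clear(); }
    void crash() { memstore.clear(); }
}

// Keeps a client-side copy of each record until a flush confirms it is persisted.
class FlushCheckpointWriter {
    private final UnloggedStore store;
    private final List<String[]> unconfirmed = new ArrayList<>();

    FlushCheckpointWriter(UnloggedStore store) { this.store = store; }

    // Buffer first, so the record can be replayed even if the put itself fails.
    void write(String row, String value) {
        unconfirmed.add(new String[] {row, value});
        store.putWithoutWal(row, value);
    }

    // Once flush returns, the buffered copies are no longer needed.
    void checkpoint() {
        store.flush();
        unconfirmed.clear();
    }

    // After a failure, re-import everything written since the last checkpoint.
    void replay() {
        for (String[] r : unconfirmed) {
            store.putWithoutWal(r[0], r[1]);
        }
    }

    int pending() { return unconfirmed.size(); }
}
```

The sketch assumes that flush returns only after the data is persisted. If
the flush call in your version is asynchronous, the client would need some
other confirmation before dropping its buffered copies.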

J-D

On Thu, May 26, 2011 at 12:38 AM, Weihua JIANG <weihua.jiang@gmail.com> wrote:
> Hi all,
>
> As far as I know, the WAL is used to keep data safe even if a single RS
> or the whole HBase cluster goes down. But it is still a burden on each
> put.
>
> I am wondering: is there any way to disable the WAL while keeping data safe?
>
> An ideal solution to me looks like this:
> 1. clients continually put records with WAL disabled.
> 2. clients call a certain HBase method to ensure all the
> previously put records are stored persistently, after which they can
> remove the records on the client side.
> 3. on error, clients re-put the possibly lost records.
>
> Or a slightly different solution is:
> 1. clients continually put records on HDFS using sequence files.
> 2. clients periodically flush the HDFS file and remove the previously
> put records on the client side.
> 3. after all records are stored on HDFS, use a map-reduce job to put
> the records into HBase with WAL disabled.
> 4. before each map-reduce task finishes, a certain HBase method is
> called to flush the in-memory data onto HDFS.
> 5. on error, the affected map-reduce task is re-executed (equivalent to
> replaying the log).
>
> Is there any way to do so in HBase? If not, do you have any plan to
> support such a usage model in the near future?
>
>
> Thanks
> Weihua
>
