hbase-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Backup / restore
Date Mon, 25 Feb 2008 18:09:28 GMT
If an offline backup/restore is acceptable, then we already have it.
All you have to do is copy your HBase rootdir to a new location in
HDFS, and you've made a backup. You can also use this technique to
copy one instance to another - just boot up a master pointed at the
new directory and voilà.
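
A rough sketch of the copy step, assuming HBase has been shut down first (this is the offline case) and that the `hadoop` command is on the PATH. The paths `/hbase` and `/backups/hbase-2008-02-25` and the helper names are made up for illustration; substitute whatever your `hbase.rootdir` actually points at.

```python
import subprocess


def build_backup_command(rootdir, backup_dir):
    """Return the argv that copies the HBase root directory within HDFS.

    Both arguments are HDFS paths. The values used below are hypothetical.
    """
    if rootdir.rstrip("/") == backup_dir.rstrip("/"):
        raise ValueError("backup location must differ from the rootdir")
    # 'hadoop fs -cp <src> <dst>' copies a path inside the filesystem.
    return ["hadoop", "fs", "-cp", rootdir, backup_dir]


def run_backup(rootdir, backup_dir, dry_run=True):
    """Print the copy command, and execute it only when dry_run is False."""
    cmd = build_backup_command(rootdir, backup_dir)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    # check=True raises CalledProcessError if the copy fails.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; HBase should be stopped before running for real.
    run_backup("/hbase", "/backups/hbase-2008-02-25")
```

To bring the copy up as a second instance, the other cluster's `hbase.rootdir` would be set to the new directory before starting its master.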

As far as dumping to a single file or a group of sql statements, that  
seems like it would be a suboptimal way to manage the amount of data  
you could potentially be working with. At the very least you want  
many files. It also makes sense to keep them in their region  
divisions, otherwise it will be an inordinate amount of work to  
restore into HBase at a later date.

Does this answer your question?


On Feb 25, 2008, at 10:00 AM, Marc Harris wrote:

> There has been discussion before about backup / restore but the
> discussion has tended to fizzle out. I would like to see backup /
> restore functionality for Hbase for the following two purposes:
> 1) Protection against software bugs deleting data. This is not just the
> proverbial namenode gone haywire, but user code running in a map-reduce
> task that deletes the wrong thing could be just as disastrous.
> 2) Ability to copy one Hbase instance's data to another instance. It's
> pretty common in sql-land to run a backup tool that produces a large
> file (either a compact export file, or just a sequence of sql
> statements). This can then be imported to another instance of the db.
> The particular use case I have is that of a production Hbase instance
> and a development or QA instance. It would be useful to be able to dump
> the production instance periodically, and then load it into a
> development instance so that new code could be run against it.
> I think this would be Hbase specific, not a general Hadoop dump /
> restore, because only the logical data should be transferred, not the
> precise structure of how tables are split into regions. Does such a
> thing exist?
