hbase-user mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Hot (consistent) incremental backup
Date Thu, 02 Jul 2015 19:59:51 GMT
Hi, Nicola

I recommend reading the HBASE-7912 design doc (it has been updated today).


On Thu, Jul 2, 2015 at 11:46 AM, Nicola Ferraro <nibbio84@gmail.com> wrote:

> HBase offers several options for backing up the data stored in a table.
> The "export" tool is described in the O'Reilly book (HBase: The Definitive
> Guide), but also here [
> http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/comment-page-1/#comment-63294
> ]
> as a way to perform hot and incremental backups on a table.
> Essentially, the procedure consists of:
> - performing the backup from time 0 to time t1
> - performing the backup from time t1 to time t2
> - ... and so on
> Suppose we want to perform an incremental backup from t1 to t2.
> Obviously the backup will start at a time t3 greater than or equal to t2
> and finish at time t4.
> An export-backup is a MapReduce job that essentially queries HBase in order
> to retrieve data updated from time t1 to t2.
> Now, suppose that a client starts writing a particular cell right before t2
> and updates it continuously with a different value every second.
> Fresh data is written to the WAL (not checked by the export tool) and the
> memstore only, so every time the client writes a different cell value, the
> old data is lost (assuming we are not using data versioning).
> This means that, if the client overwrites the cell after t2 but before t3,
> the backup process will not export a consistent snapshot taken at time t2;
> instead, the backup will contain the fresh data written after t2. The same
> could also happen with data written by the client after t3 and before t4
> (i.e. while the backup is in progress).
> In order to make the incremental (consistent) backup work, I see two
> options:
> - Enable (infinite) version history for all data written to HBase (to
> avoid overwriting in the memstore)
> - Disable compaction temporarily, force a memstore flush (e.g. with a
> "snapshot" command), perform the backup with t2 being the snapshot time,
> then re-enable compaction.
> I don't know if the second option is feasible as I did not find a way to
> disable compaction temporarily.
> Is there any other reliable, feasible option to execute hot +
> consistent + incremental backups with HBase?
> Nicola
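
The scenario in the quoted message can be reproduced with a small toy model.
The Python sketch below is not HBase code: SingleVersionStore, backup_windows,
and the timestamps are hypothetical names and values chosen for illustration,
and the store simply assumes, as the message does, that an overwritten value
is no longer retrievable when versioning is off. It shows that an export of
the window [t1, t2) that runs after the cell has been overwritten no longer
sees the value that was current at t2.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, str]  # (timestamp, value)


class SingleVersionStore:
    """Toy model (assumption): keeps only the newest write per row."""

    def __init__(self) -> None:
        self._cells: Dict[str, Cell] = {}

    def put(self, row: str, ts: int, value: str) -> None:
        # A newer (or equal) timestamp replaces the stored cell entirely.
        current = self._cells.get(row)
        if current is None or ts >= current[0]:
            self._cells[row] = (ts, value)

    def export(self, start: int, end: int) -> Dict[str, Cell]:
        # Return cells whose timestamp lies in the half-open range [start, end).
        return {r: c for r, c in self._cells.items() if start <= c[0] < end}


def backup_windows(checkpoints: List[int]) -> List[Tuple[int, int]]:
    # Turn checkpoints [0, t1, t2, ...] into adjacent windows [0, t1), [t1, t2), ...
    return list(zip(checkpoints, checkpoints[1:]))


if __name__ == "__main__":
    t1, t2 = 100, 200
    store = SingleVersionStore()
    store.put("row1", 190, "v1")      # written shortly before t2
    print(store.export(t1, t2))       # the cell is visible if exported now
    store.put("row1", 205, "v2")      # overwritten after t2, before the job runs
    print(store.export(t1, t2))       # empty: the value current at t2 is gone
    print(backup_windows([0, t1, t2]))
```

In this model the second export of [t1, t2) returns nothing for row1, so a
restore to t2 would miss the cell. Keeping every version (the first option in
the message) corresponds to storing a list of cells per row instead of a
single one.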
