hbase-user mailing list archives

From Subash K <subas...@ericsson.com>
Subject Hbase Snapshot Export Data storage
Date Thu, 21 Dec 2017 04:36:53 GMT
We have a use case that requires transferring data from one cluster to another. We currently use
CopyTable, but it puts load on the region servers and takes a long time to complete
the transfer.

So we are exploring the HBase ExportSnapshot feature and plan to proceed with
the steps below.

  1.  Take a snapshot of the table on the source cluster.
  2.  Run the ExportSnapshot job to send the snapshot to the destination cluster.
  3.  Restore the snapshot on the destination cluster.
  4.  Access the data on the destination (this works for us).
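
For reference, the steps above can be sketched as the commands they roughly correspond to. This is only a minimal illustration that builds the command strings without running them. The table name, snapshot name, and destination URL are hypothetical, the mapper count is arbitrary, and the restore part assumes the table already exists on the destination (otherwise clone_snapshot would be the usual alternative).

```python
import shlex


def snapshot_statement(table, snapshot):
    """HBase shell statement that snapshots `table` on the source cluster."""
    return f"snapshot '{table}', '{snapshot}'"


def export_snapshot_argv(snapshot, dest_root, mappers=4):
    """Command line for the ExportSnapshot job, launched from the source cluster."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", dest_root,       # HBase root directory of the destination cluster
        "-mappers", str(mappers),    # number of parallel copy tasks
    ]


def restore_statements(table, snapshot):
    """HBase shell statements that restore `snapshot` over an existing, disabled table."""
    return [
        f"disable '{table}'",
        f"restore_snapshot '{snapshot}'",
        f"enable '{table}'",
    ]


if __name__ == "__main__":
    table, snap = "my_table", "my_table_snap"   # hypothetical names
    dest = "hdfs://dest-namenode:8020/hbase"    # hypothetical destination root
    print(snapshot_statement(table, snap))
    print(shlex.join(export_snapshot_argv(snap, dest)))
    print("\n".join(restore_statements(table, snap)))
```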

We want to understand how the data is handled on the destination after the snapshot is restored,
because we can still see the data under the /hbase/archive/data directory in HDFS, while only
references to it are kept under /hbase/data/.
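
To make the comparison concrete, here is a small sketch of how the two locations can be derived for one table and checked for size. It assumes the HBase root directory is /hbase and the table lives in the "default" namespace; the table name is hypothetical, and the helper only builds the `hdfs dfs -du` command rather than running it.

```python
import posixpath


def table_dirs(table, namespace="default", root="/hbase"):
    """Return (live_dir, archive_dir) for a table under the assumed HBase root."""
    live = posixpath.join(root, "data", namespace, table)
    archived = posixpath.join(root, "archive", "data", namespace, table)
    return live, archived


def du_argv(path):
    """Command line that summarises the space used under an HDFS path."""
    return ["hdfs", "dfs", "-du", "-s", "-h", path]


if __name__ == "__main__":
    live, archived = table_dirs("my_table")   # hypothetical table name
    for path in (live, archived):
        print(" ".join(du_argv(path)))
```

Running the printed commands before and after a compaction would show which of the two directories actually holds the bulk of the data.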

Can someone help us understand the following?

  1.  When will the data under /hbase/archive/data be removed?
  2.  When new data is inserted into the table, where will it be stored: in /hbase/archive/data
or in /hbase/data?
  3.  I tried deleting the snapshot and running a major compaction on the table, and the data
moved from /hbase/archive/data to /hbase/data. So is a major compaction always required after
restoring a snapshot to move the data to its proper location?
  4.  I can see data being stored in the archive even when there is no snapshot. In
what other scenarios is data stored in /hbase/archive/data/?

Subash Kunjupillai
