hbase-user mailing list archives

From Anil Gupta <anilgupt...@gmail.com>
Subject Re: Rename tables or swap alias
Date Tue, 16 Feb 2016 02:21:07 GMT
I don't think there are any atomic operations in HBase that support DDL across two tables.

But maybe you can use HBase snapshots:
1. Create an HBase snapshot of the table.
2. Truncate the table.
3. Write the new data to the table.
4. Clone the snapshot taken in step 1 into a new table named table_old.

Now you have two tables: one with the current run's data and one with the last run's data.
I think the above process will suffice, but keep in mind that it is not atomic.
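
A rough Java sketch of that sequence, to make the ordering concrete. TableAdmin is a hypothetical stand-in interface: its method names mirror snapshot, disableTable, truncateTable and cloneSnapshot on org.apache.hadoop.hbase.client.Admin, but it takes plain strings instead of TableName and leaves out the IOException those calls throw. The table and snapshot names are placeholders, and the disable step assumes that the client-side truncateTable call expects a disabled table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the four admin calls used below. The names mirror
// methods on HBase's Admin interface; the signatures are simplified.
interface TableAdmin {
    void snapshot(String snapshotName, String table);
    void disableTable(String table);
    void truncateTable(String table, boolean preserveSplits);
    void cloneSnapshot(String snapshotName, String newTable);
}

final class SnapshotRotation {
    private SnapshotRotation() {}

    // Runs the four steps in order. Not atomic: readers can observe an empty
    // or partially written table between the truncate and the end of the write.
    static void rotate(TableAdmin admin, String table, String snapshotName,
                       String oldTable, Runnable writeNewData) {
        admin.snapshot(snapshotName, table);          // step 1: snapshot the last run
        admin.disableTable(table);                    // assumption: truncate needs a disabled table
        admin.truncateTable(table, true);             // step 2: empty the table, keep region splits
        writeNewData.run();                           // step 3: load the current run
        admin.cloneSnapshot(snapshotName, oldTable);  // step 4: last run becomes oldTable
    }
}

// Dry-run implementation that only records what would have been called.
final class RecordingAdmin implements TableAdmin {
    final List<String> calls = new ArrayList<>();

    @Override public void snapshot(String snapshotName, String table) {
        calls.add("snapshot " + snapshotName + " of " + table);
    }
    @Override public void disableTable(String table) {
        calls.add("disable " + table);
    }
    @Override public void truncateTable(String table, boolean preserveSplits) {
        calls.add("truncate " + table);
    }
    @Override public void cloneSnapshot(String snapshotName, String newTable) {
        calls.add("clone " + snapshotName + " as " + newTable);
    }
}
```

With a real cluster you would implement TableAdmin by delegating to the Admin obtained from your Connection. The clone in step 4 does not depend on the new data, so it could also run right after step 1 if you want table_old available sooner.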

Sent from my iPhone

> On Feb 15, 2016, at 4:25 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
> Any other way to do what I was asking. With Spark this is a very normal
> thing to treat a table as immutable and create another to replace the old.
> Can you lock two tables and rename them in 2 actions then unlock in a very
> short period of time?
> Or an alias for table names?
> Didn’t see these in any docs or Googling, any help is appreciated. Writing
> all this data back to the original table would be a huge load on a table
> being written to by external processes and therefore under large load to
> begin with.
>> On Feb 14, 2016, at 5:03 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> There is currently no native support for renaming two tables in one atomic
>> action.
>> FYI
>>> On Sun, Feb 14, 2016 at 4:18 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
>>> I use Spark to take an old table, clean it up to create an RDD of cleaned
>>> data. What I’d like to do is write all of the data to a new table in HBase,
>>> then rename the table to the old name. If possible it could be done by
>>> changing an alias to point to the new table as long as all external code
>>> uses the alias, or by a 2 table rename operation. But I don’t see how to do
>>> this for HBase. I am dealing with a lot of data so don’t want to do table
>>> modifications with deletes and upserts, this would be incredibly slow.
>>> Furthermore I don’t want to disable the table for more than a tiny span of
>>> time.
>>> Is it possible to have 2 tables and rename both in an atomic action, or
>>> change some alias to point to the new table in an atomic action. If not
>>> what is the quickest way to achieve this to minimize time disabled.
