hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Efficiently wiping out random data?
Date Wed, 19 Jun 2013 12:31:49 GMT
Hey devs,

I was presenting at GOTO Amsterdam yesterday and I got a question
about a scenario that I've never thought about before. I'm wondering
what others think.

How do you efficiently wipe out random data in HBase?

For example, you have a website and a user asks you to close their
account and get rid of the data.

Would you say "sure can do, lemme just issue a couple of Deletes!" and
call it a day? What if you really have to delete the data, not just
mask it, because of contractual obligations or local laws?

Major compaction is the obvious solution, but it seems really
inefficient. Let's say you've got some truly random data to delete and
it so happens that you have at least one row per region to get rid
of... then you basically need to rewrite the whole table?
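
To make that cost concrete, here is a rough toy model in Python. It is
not HBase code: the Region class, its methods, and the wipe() helper
are all made up for illustration. It assumes only that a Delete adds a
marker that masks the row, and that the old data goes away only when a
major compaction rewrites the region's files.

```python
# Toy model, NOT the HBase API: every name below is hypothetical.
# Assumption: a delete only masks a row; a major compaction rewrites
# all surviving cells of a region to physically drop the masked ones.

class Region:
    def __init__(self, rows):
        self.store_file = dict(rows)  # row key -> value, as written on disk
        self.tombstones = set()       # delete markers added since then

    def delete(self, row):
        # Masks the row for readers; the data stays in the store file.
        self.tombstones.add(row)

    def get(self, row):
        if row in self.tombstones:
            return None
        return self.store_file.get(row)

    def on_disk(self, row):
        return row in self.store_file

    def major_compact(self):
        # Rewrite every surviving cell and drop the masked ones.
        self.store_file = {
            k: v for k, v in self.store_file.items()
            if k not in self.tombstones
        }
        self.tombstones.clear()
        return len(self.store_file)  # number of cells rewritten


def wipe(regions, rows_to_delete):
    """Delete (region_id, row) pairs, then major compact touched regions.

    Returns (cells_rewritten, total_cells_before).
    """
    total_before = sum(len(r.store_file) for r in regions.values())
    touched = set()
    for region_id, row in rows_to_delete:
        regions[region_id].delete(row)
        touched.add(region_id)
    rewritten = sum(regions[r].major_compact() for r in touched)
    return rewritten, total_before
```

In this model, with 4 regions of 1000 rows each, deleting one row in
every region rewrites 3996 of the 4000 cells, while deleting 4 rows
that all live in one region rewrites only 996.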

That was my answer, and I told the attendee that it's not an easy use
case to manage in HBase.


