hbase-dev mailing list archives

From Mike Drob <md...@apache.org>
Subject Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0
Date Tue, 14 Nov 2017 21:54:20 GMT
On Tue, Nov 14, 2017 at 2:57 PM, Josh Elser <elserj@apache.org> wrote:

> On 11/14/17 3:04 PM, Mike Drob wrote:
>> I don't think the second part of my email ever got addressed.
>> I see  "HBase Backup/Restore Phase 3: Security"[1] resolved as "Later"
>> and claims that it will be implemented in the client, both of which make
>> me
>> uncomfortable. Security Later is a general bad practice, and it is very
>> rarely correct to rely on client-side security for anything.
>> Is there another issue that covers security? Do we rely completely on
>> HDFS security here for more than just the DistCP? What kind of testing has
>> been done with security, do we have assurances that the backups aren't
>> accidentally exposing tables to the world?
> "Security" as you phrase is pretty open ended, no? The current security
> model is based around the filesystem permissions and the enforcement of an
> HBase superuser to execute the necessary service operations behind the
> BackupAdmin "facade" (e.g. WAL roll procedure execution, snapshot creation,
> snapshot restore, update hbase:backup are the HBase client actions actually
> being performed). That's the state of what it is right now and, yes, it
> does rely on the filesystem backups are sent to (e.g. HDFS, S3, Isilon,
> WASB) are properly secured. We certainly don't want to be testing
> correctness of those systems in HBase.
Yea, it's somewhat open ended. Relying on filesystem enforcement is
probably sufficient for now, and I agree that it is not within our scope to
be testing correctness of their implementation.

> I can see a small section on the documentation update I've already been
> hacking on to include details on the issue "We can't help you secure where
> you put the data". Given how many instances of "globally readable S3
> bucket" I've seen recently, this strikes me as prudent.

I would prefer this to be a giant, hard-to-miss, red-letter, all-caps
warning; not a small section. I do think it is our responsibility to
tell users how to configure the backup/restore process for communicating
with secure systems. Or, at a minimum, to document how we pass arbitrary
configuration options that can then be used to communicate with said
systems.

For example, if we support writing backups to S3, then we should have a way
to specify an Auth string and maybe even some of the custom headers like
x-amz-acl. We don't have to explicitly enumerate best practices, but if the
only option is to write to a globally open bucket, then I don't think we
should advertise writing to S3 as an available option.

Similarly, if we tell people that they can send backups to HDFS, then we
should give them the hooks to correctly interface with a kerberized HDFS.
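
To make "pass arbitrary configuration options" concrete, here is a rough
sketch of the kind of pass-through I have in mind. Everything in it is
illustrative: the prefix hbase.backup.destination.conf. and the class name
are made up for this email and are not existing HBase settings. The two
keys it carries, fs.s3a.acl.default and hadoop.security.authentication, are
Hadoop-side property names as far as I know, but check the Hadoop docs
before relying on them. Plain maps stand in for a Hadoop Configuration so
the snippet runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BackupDestinationConf {
    // Hypothetical prefix, invented for this sketch; not an HBase setting.
    static final String PREFIX = "hbase.backup.destination.conf.";

    /**
     * Returns a copy of base with every PREFIX-ed user entry applied on top,
     * prefix stripped. Entries without the prefix are ignored, and base
     * itself is left untouched.
     */
    static Map<String, String> overlay(Map<String, String> base,
                                       Map<String, String> userSupplied) {
        Map<String, String> merged = new LinkedHashMap<>(base);
        for (Map.Entry<String, String> e : userSupplied.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                merged.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("hadoop.security.authentication", "simple");

        Map<String, String> user = new LinkedHashMap<>();
        // Ask for a private ACL on whatever gets written to the bucket.
        user.put(PREFIX + "fs.s3a.acl.default", "Private");
        // Talk to a kerberized HDFS instead of an unsecured one.
        user.put(PREFIX + "hadoop.security.authentication", "kerberos");
        user.put("unrelated.key", "ignored");

        System.out.println(overlay(base, user));
    }
}
```

The point is only that the operator gets a documented way to hand
destination-specific settings through to the filesystem client, without us
enumerating every option each filesystem supports.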

Maybe this is already in the proposed patch, I haven't gone looking yet.

> The final issue then is about the backup containing other table's data --
> somehow a backup would reference data from another table than the one the
> admin intended to access. For full backups, this is out of scope (the full
> backup is relying on Snapshots -- we shouldn't be testing correctness of
> Snapshots via B&R). For incremental backups, specifically when we're
> filtering WALs, this is a concern. Thankfully, it's an analogous problem to
> "correctness". We have unit test coverage in this area already, and we
> should get good coverage in the up-coming integration test.

Again, agree on the general outline of scope you've suggested. Are we
testing the correctness on the backup itself or on a table built from the
restore of that backup? There may be a subtle difference between the two.
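
As one hypothetical illustration of how the two could differ (a toy model,
not HBase's WAL or backup classes, and not a claim about how the current
code behaves): if restore only replays edits for the requested table, a
backup that accidentally carries another table's edits would still produce
a correct-looking restored table. A check on the artifact itself would
catch that; a check on the restored table would not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BackupVsRestoreCheck {
    /** Toy stand-in for a WAL edit; not an HBase class. */
    static final class Edit {
        final String table;
        final String row;
        Edit(String table, String row) {
            this.table = table;
            this.row = row;
        }
    }

    /** Artifact-level check: every edit stored in the backup belongs to table. */
    static boolean artifactIsClean(List<Edit> artifact, String table) {
        for (Edit e : artifact) {
            if (!e.table.equals(table)) {
                return false;
            }
        }
        return true;
    }

    /** Toy restore: replays only the requested table's edits and returns its rows. */
    static List<String> restore(List<Edit> artifact, String table) {
        List<String> rows = new ArrayList<>();
        for (Edit e : artifact) {
            if (e.table.equals(table)) {
                rows.add(e.row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // A backup of t1 that (wrongly) also carries an edit from t2.
        List<Edit> leaky = Arrays.asList(new Edit("t1", "r1"), new Edit("t2", "secret"));

        System.out.println("restored rows:  " + restore(leaky, "t1"));
        System.out.println("artifact clean: " + artifactIsClean(leaky, "t1"));
    }
}
```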

I've got some ideas for interesting sequences that would be good to verify,
but need a bit of time to check that I'd be asserting what I think I'm
asserting. Will need a few days to digest and then I should have something
I can concretely point at and ask "what about this?"

> Does that help paint a better picture, Mike? Have I missed or glossed over
> any points?

Yes, this was very helpful. Thanks, Josh.
