hbase-user mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: Standalone == Dev Only?
Date Fri, 13 Mar 2015 20:44:26 GMT
On Fri, Mar 13, 2015 at 2:41 PM, Michael Segel <michael_segel@hotmail.com>
wrote:

>
> In stand alone, you’re writing to local disk. You lose the disk, you lose
> the data, unless of course you’ve RAIDed your drives.
> Then when you lose the node, you lose the data because it’s not being
> replicated. While this may not be a major issue or concern… you have to be
> aware of its potential.
>
>
It sounds like he has this issue covered via VM imaging.



> The other issue when it comes to security, HBase relies on the cluster’s
> security.
> To be clear, HBase relies on the cluster and the use of Kerberos to help
> with authentication.  So that only those who have the rights to see the
> data can actually have access to it.
>
>

He can get around this by relying on the Thrift or REST services to act as
an arbitrator, or he could make his own. So long as he separates access to
the underlying cluster / hbase apis from whatever exposes the data,
this shouldn't be a problem.
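
As a rough sketch of that separation, assuming the stock REST gateway is
running and returns its usual JSON encoding (row keys, column names, and
values base64-encoded), a front-end service could read a row like this
without ever touching the HBase client APIs. The gateway address, table
name, and helper names are made up for illustration.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def row_url(gateway, table, row_key):
    """Build the URL for a single-row GET against the REST gateway."""
    base = gateway.rstrip("/")
    return f"{base}/{quote(table, safe='')}/{quote(row_key, safe='')}"


def decode_cells(payload):
    """Decode the gateway's JSON into {row_key: {column: value_bytes}}.

    Assumes the layout {"Row": [{"key": ..., "Cell": [{"column": ..., "$": ...}]}]}
    with every key, column, and value base64-encoded.
    """
    doc = json.loads(payload)
    result = {}
    for row in doc.get("Row", []):
        key = base64.b64decode(row["key"]).decode("utf-8")
        cells = result.setdefault(key, {})
        for cell in row.get("Cell", []):
            column = base64.b64decode(cell["column"]).decode("utf-8")
            cells[column] = base64.b64decode(cell["$"])
    return result


def fetch_row(gateway, table, row_key, timeout=10):
    """Fetch one row over HTTP; only the gateway talks to HBase itself."""
    request = Request(
        row_url(gateway, table, row_key),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return decode_cells(response.read())
```

Something like fetch_row("http://gateway.example:8080", "patients", "row1")
would then be the only path to the data, and whatever authentication he
needs can sit in front of that service.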



> Then you have to worry about auditing. With respect to HBase, out of the
> box, you don’t have any auditing.
>
>

HBase has auditing. By default it is disabled and it certainly could use
some improvement. Documentation would be a good start. I'm sure the
community would be happy to work with Joseph to close whatever gap he needs.




> You also don’t have built in encryption.
> You can do it, but then you have a bit of work ahead of you.
> Cell level encryption? Accumulo?
>
>
HBase has had encryption since the 0.98 line. It is stable now in the
1.0 release line. HDFS also supports encryption, though I'm sure using it
with the LocalFileSystem would benefit from testing. There are vendors that
can help with integration with proper key servers, if that is something
Joseph needs and doesn't want to do on his own.

Accumulo does not do cell level encryption.



> There’s definitely more to it.
>
> But the one killer thing… you need to be HIPAA compliant and the simplest
> way to do this is to use a real RDBMS. If you need extensibility, look at
> IDS from IBM (IBM bought Informix ages ago.)
>
> I think based on the size of your data… you can get away with the free
> version, and even if not, IBM does do discounts with Universities and could
> even sponsor research projects.
>
> I don’t know your data, but 10^6 rows is still small.
>
> The point I’m trying to make is that based on what you’ve said, HBase is
> definitely not the right database for you.
>
>
We haven't heard what the target data set size is. If Joseph has reason to
believe that it will be big enough to warrant something like HBase (e.g.
10s of billions of rows), I think there's merit to his argument for
starting with HBase. Single node use cases are definitely not something
we've covered well to date, but it would probably help our overall
usability story to do so.


-- 
Sean
