hbase-user mailing list archives

From Jonathan Gray <jg...@fb.com>
Subject RE: question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)
Date Wed, 22 Dec 2010 01:44:36 GMT

> 1.       It's a column based sparse table so null's take up no space(ie.
> More room when we need to duplicate)

Correct.  Nulls take up no space.
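To make that concrete, here is a rough sketch in Python (not HBase code) of how a sparse, column-oriented store behaves. The `cells` dict, the `put`/`get` helpers, and the row and column names are all made up for illustration. The point is only that a column that was never written has no entry at all, so there is nothing to store for it.

```python
# Toy model of a sparse store: only cells that are actually written exist.
# All names here are hypothetical; this is not the HBase client API.
cells = {}  # (row_key, column) -> value


def put(row_key, column, value):
    """Store a single cell under its (row, column) coordinates."""
    cells[(row_key, column)] = value


def get(row_key, column):
    """Return the cell value, or None if the cell was never written."""
    return cells.get((row_key, column))


put("row1", "info:name", "alice")
put("row1", "info:email", "alice@example.com")
put("row2", "info:name", "bob")  # row2 has no email: nothing is stored for it

stored_cell_count = len(cells)
missing_value = get("row2", "info:email")
```

The flip side is that every cell that does exist carries its own coordinates (row key, column, and in HBase also a timestamp), so short row keys and column names help keep the stored size down.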

> 2.       Indexes take up space in an RDBMS already and are essentially
> duplication in your old RDBMS anyways

Secondary indexes in an RDBMS use additional space.  Primary indexes may not, depending on
the database.

> 3.       The designs will be quite a bit different eliminating the need
> for those indexes(maybe we only have 3 later out of the 7, and the indexes
> in hbase are a bit bigger than indexes in the old RDBMS too???)

Designs will most likely be different.  Number of indexes may not be the same.  Hard to say
more without knowing the specifics.

Hard to say what will be bigger where.  HBase "indexes" (really just tables) are generally
highly compressible.  This is generally not the case for RDBMS indexes.
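As a rough illustration of both points, the Python sketch below builds an "index" as nothing more than a sorted list of row keys of the form indexed-value + separator + primary-key, then compresses it. Everything here is an assumption for the sake of the example: the `users` data, the `|` separator, the `lookup` helper, and zlib standing in for whatever compression a real cluster would be configured with. Actual ratios depend entirely on your data and codec.

```python
import bisect
import zlib

# Hypothetical primary data: primary row key -> city.
users = {f"user{i:06d}": ("austin" if i % 2 else "boston") for i in range(1000)}

# The "index" is just another sorted table whose row key is
# <indexed value>|<primary row key>.
index_rows = sorted(f"{city}|{user_id}" for user_id, city in users.items())


def lookup(city):
    """Prefix scan over the sorted index rows; returns matching primary keys."""
    prefix = city + "|"
    matches = []
    # Jump to the first row at or after the prefix, then read while it matches.
    for row in index_rows[bisect.bisect_left(index_rows, prefix):]:
        if not row.startswith(prefix):
            break
        matches.append(row.split("|", 1)[1])
    return matches


# Sorted keys share long prefixes, so a general-purpose compressor does well.
raw = "\n".join(index_rows).encode("utf-8")
compressed = zlib.compress(raw)
```

Comparing `len(compressed)` with `len(raw)` on a sample of your own index keys is a cheap way to get a feel for how much the repeated prefixes help before committing to a design.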

An additional point about HBase vs. RDBMS when talking about disk space is that HBase will
work just fine on regular 7.2k RPM drives, whereas good performance from RDBMS indexes often
requires higher-end 15k RPM drives (cost per gigabyte is MUCH higher on these drives).

> Thanks for any feedback here
> Dean
