hbase-user mailing list archives

From Pradheep Shanmugam <Pradheep.Shanmu...@infor.com>
Subject HBase Schema
Date Fri, 18 Nov 2016 16:42:02 GMT

I have a table in HBase that stores multiple versions of a document as separate rows.
The row key is of the form <orgid><doctype><docid><timestamp>; the
timestamp differs across versions of the same document.
Orgs are skewed: one org may have 1 billion docs while others may have just 100K docs.
So I decided to salt the key to spread writes across all region servers and to improve the
Another factor in favor of salting is that these docs will not be referenced after about
6 months; only the newer ones will be queried often.
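
For illustration, here is a minimal sketch of one way such a salted key could be laid out. The post does not say how the salt is computed, so everything here is an assumption: the bucket count NUM_BUCKETS, the helper names salt_for, doc_prefix and row_key, the one-byte salt, the "|" field delimiter, and the use of MD5 are all hypothetical. The sketch derives the salt from <orgid><doctype><docid> only (not the timestamp), so all versions of one document share a bucket and a reader can recompute the prefix.

```python
import hashlib
import struct

NUM_BUCKETS = 16  # hypothetical bucket count, not taken from the original post


def salt_for(org_id: str, doc_type: str, doc_id: str, buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable bucket number from the document identity (timestamp excluded)."""
    identity = "\x00".join((org_id, doc_type, doc_id)).encode("utf-8")
    digest = hashlib.md5(identity).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def doc_prefix(org_id: str, doc_type: str, doc_id: str, buckets: int = NUM_BUCKETS) -> bytes:
    """Salt byte followed by the identity fields; shared by every version of a document."""
    salt = salt_for(org_id, doc_type, doc_id, buckets)
    # Assumes the ids never contain "|" and that the salt fits in one byte.
    fields = b"|".join(part.encode("utf-8") for part in (org_id, doc_type, doc_id))
    return struct.pack(">B", salt) + fields + b"|"


def row_key(org_id: str, doc_type: str, doc_id: str, timestamp_ms: int,
            buckets: int = NUM_BUCKETS) -> bytes:
    """Full key: <salt><orgid><doctype><docid><timestamp>, timestamp as 8 big-endian bytes."""
    return doc_prefix(org_id, doc_type, doc_id, buckets) + struct.pack(">Q", timestamp_ms)
```

With this layout, a read for a given <orgid><doctype><docid> can rebuild doc_prefix and scan only that prefix instead of checking every bucket. Because the timestamp is stored in ascending byte order, the newest version is the last row in that prefix range.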

Assuming a hybrid load, will salting affect read performance (getting the latest version of
a document given the <orgid><doctype><docid>) for large and small orgs when there
are more than 10 billion rows in total?

