hbase-user mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Is it necessary to set MD5 on rowkey?
Date Wed, 19 Dec 2012 22:15:25 GMT
On Wed, Dec 19, 2012 at 1:26 PM, David Arthur <mumrah@gmail.com> wrote:

> Let's say you want to decompose a url into domain and path to include in
> your row key.
> You could of course just use the url as the key, but you will see
> hotspotting since most will start with "http".

Doesn't the original Bigtable paper [0] design around this problem by
dropping the protocol and only storing the domain? *goes to check* Yes, it does.
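
For illustration, here is a rough sketch of that kind of key: drop the
scheme, then put the host first and the path after it. The host labels are
reversed here (the Bigtable paper's own example row key is "com.cnn.www") so
that pages from one domain sort next to each other. The function name
url_row_key and the choice to append the query string are assumptions made
for this example, not anything HBase or Bigtable prescribes.

```python
from urllib.parse import urlsplit


def url_row_key(url: str) -> str:
    """Build a row key from a URL: reversed host labels, then the path.

    Hypothetical helper for illustration only. The scheme ("http", "https")
    is dropped so keys do not all share the same leading bytes.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # "www.cnn.com" -> "com.cnn.www", so one domain's pages sort together
    reversed_host = ".".join(reversed(host.split(".")))
    path = parts.path or "/"
    if parts.query:
        # Keeping the query string is an assumption; drop it if it is noise
        path += "?" + parts.query
    return reversed_host + path


if __name__ == "__main__":
    print(url_row_key("http://www.cnn.com/index.html"))
```

Keys built this way spread across regions according to how the domains
themselves are distributed, and a prefix scan on "com.cnn." still returns
every page under that domain.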

Personally, I've never encountered an HBase schema design problem where
salting really nailed it. It's an okay place to start with initial designs,
especially if you don't know your data well. I'm a big fan of using the
natural variance in the data itself to solve this problem. OpenTSDB does
this quite well, IMHO. Plus, it's kind of a game or data puzzle -- how to
use the data's nature to your advantage :)
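
For comparison, this is roughly what salting with a hash looks like: a small
bucket prefix derived from the key is put in front of it. This is only a
sketch; the name salted_key, the 16-bucket default, and the "|" separator
are assumptions made for this example.

```python
import hashlib


def salted_key(key: str, buckets: int = 16) -> str:
    """Prefix a row key with a bucket number derived from its MD5 digest.

    Hypothetical helper for illustration only. The same key always maps to
    the same bucket, so point lookups can recompute the prefix.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = digest[0] % buckets  # first digest byte picks the bucket
    return f"{bucket:02d}|{key}"


if __name__ == "__main__":
    print(salted_key("http://www.cnn.com/index.html"))
```

The cost is that keys which were neighbours are now scattered across
buckets, so a range scan over the original key order has to be issued once
per bucket and merged on the client. A key built from the data's own
variance avoids that.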

Just my 2¢

