hbase-user mailing list archives

From Buntu Dev <buntu...@gmail.com>
Subject HBase schema design
Date Thu, 27 Aug 2015 18:58:47 GMT
I'm planning on writing a time series of user action events, including user
profile, attributes, and product purchase transactions, to answer these
questions:

- What are the events leading up to the user's conversion, i.e., purchase?
- What are the different attributes that changed over a given time period?
- What is the LTV of a given user?
- Retrieve the list of attributes set/enabled for a given user at some point
in time

As a newbie to HBase, I wanted to confirm that a tall table design, i.e., with
row key <userid>_<timestamp>, is _not_ the right design for these reasons:

* scanning for the latest state of a user seems to be an expensive operation,
since not all the columns will be available in the latest event for the user

* constructing a row key always requires the timestamp to be appended if I'm
not using regex filtering

* fetching the state of a user at some point in time t1 involves fetching all
the "<userid>*" rows and finding the latest row with timestamp <= t1

Are these valid concerns?
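
To make the first and third points concrete, here is a minimal sketch of the lookup being described. It does not use the HBase client at all: a sorted Python list stands in for HBase's lexicographically ordered row keys, and the names TallTable, row_key, and state_at are hypothetical. It also assumes the timestamp is stored as zero-padded epoch milliseconds, so that string order matches time order. Under those assumptions, rebuilding a user's state at t1 means reading every event row for that user up to t1 and folding the columns together, because no single row carries all the columns.

```python
from bisect import bisect_left, insort

TS_WIDTH = 13  # assumption: zero-padded epoch millis, so string order == time order


def row_key(user_id, ts_ms):
    """Build a tall-table key of the form <userid>_<timestamp>."""
    return f"{user_id}_{ts_ms:0{TS_WIDTH}d}"


class TallTable:
    """In-memory stand-in for a table whose rows are sorted by row key."""

    def __init__(self):
        self.keys = []   # row keys kept in sorted order
        self.rows = {}   # row key -> {column: value}

    def put(self, user_id, ts_ms, columns):
        key = row_key(user_id, ts_ms)
        if key not in self.rows:
            insort(self.keys, key)
        self.rows.setdefault(key, {}).update(columns)

    def scan(self, start_key, stop_key):
        """Yield rows with start_key <= key < stop_key, in key order."""
        lo = bisect_left(self.keys, start_key)
        hi = bisect_left(self.keys, stop_key)
        for key in self.keys[lo:hi]:
            yield key, self.rows[key]


def state_at(table, user_id, t1_ms):
    """Fold all of a user's events with timestamp <= t1 into one state dict."""
    start = row_key(user_id, 0)
    stop = row_key(user_id, t1_ms + 1)  # exclusive stop, so t1 itself is included
    state = {}
    for _, columns in table.scan(start, stop):
        state.update(columns)  # later events overwrite earlier values
    return state


table = TallTable()
table.put("u1", 1000, {"plan": "free", "country": "US"})
table.put("u1", 2000, {"plan": "pro"})
table.put("u1", 3000, {"purchase": "sku-9"})
table.put("u10", 1500, {"plan": "trial"})  # different user with a shared prefix

snapshot = state_at(table, "u1", 2500)
```

In this sketch the cost of state_at grows with the number of events the user has before t1, which is the expense the first bullet is worried about.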

