hbase-user mailing list archives

From Marko Dinic <hacker.ma...@gmail.com>
Subject Rowkey design
Date Sun, 29 Nov 2015 22:19:11 GMT
Hello, everyone!

I'm new to HBase and I need help designing rowkeys for a use case that looks
like this:

- Products are listed, where each product has a product id.
- Each product has a timestamp.
- Each product is created in a certain place (e.g. city).
- Each product is created by some unit (e.g. factory).

I would like to be able to scan products from a certain time period, from a
certain place, or from a certain unit.

I read about salting to avoid hot-spotting, and I understand that rows are
stored sorted by rowkey. This will allow me to scan for a certain time period
using the following rowkey:


I can then specify the period using STARTROW and ENDROW.
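
A minimal sketch of how such a salted key and its scan bounds could be built. Everything here is an assumption for illustration: the key layout (salt, then timestamp, then product id), the bucket count, and the helper names are hypothetical and not taken from any HBase API. Because the salt is the first component, a time-range scan has to be issued once per salt bucket.

```python
import hashlib

NUM_BUCKETS = 4  # hypothetical number of salt buckets


def salt_for(product_id: str) -> int:
    """Derive a stable salt bucket from the product id (assumed scheme)."""
    digest = hashlib.md5(product_id.encode("utf-8")).digest()
    return digest[0] % NUM_BUCKETS


def make_rowkey(product_id: str, timestamp_ms: int) -> bytes:
    """Build an assumed key layout: salt-timestamp-productId.

    The timestamp is zero-padded to a fixed width so that byte-wise
    (lexicographic) ordering matches chronological ordering.
    """
    return b"%02d-%013d-%s" % (
        salt_for(product_id),
        timestamp_ms,
        product_id.encode("utf-8"),
    )


def scan_ranges(start_ms: int, end_ms: int):
    """Return one (start_row, stop_row) pair per salt bucket.

    Each pair covers timestamps in [start_ms, end_ms) within that bucket.
    """
    return [
        (b"%02d-%013d" % (bucket, start_ms), b"%02d-%013d" % (bucket, end_ms))
        for bucket in range(NUM_BUCKETS)
    ]
```

Each pair returned by `scan_ranges` would be used as the start and stop row of a separate scan, and the results from all buckets combined.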

What confuses me is how to include the place (and maybe the unit) in the key
and still be able to select products from a certain place during a certain
time period.

If I limit myself to scanning by only one of the above (time range OR
place), I have an idea to duplicate the data into two different tables, one
with (salt-productId-timestamp) keys and the other with
(salt-productId-place) keys. Is that recommended or not?

So, how should I construct my keys?

I should emphasize that I need this data to be the input to a MapReduce job.

Any help is greatly appreciated.

Best regards,
