hbase-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Help in designing row key
Date Tue, 02 Jul 2013 16:13:25 GMT
Hi everybody,

In my use case I have to run batch analyses that skip old data.
For example, I want to process all rows created after a certain timestamp,
passed as parameter.

What is the most effective way to do this?
Should I design my row key to embed the timestamp?
Or is filtering by the row's timestamp just as fast? Or is there another option?

Initially I was thinking of composing my key as:
timestamp|source|title|type

but:

1) Using a timestamp in row keys is discouraged.
2) If this design is OK, I still have problems filtering by timestamp with
this approach, because I cannot find a way to filter numerically
(instead of alphanumerically/by string). Example:
1372776400441|something has a lower timestamp
than 1372778470913|somethingelse, but I cannot filter all rows whose key is
"numerically" greater than 1372776400441. Is it possible to overcome this
issue?
3) If this design is not OK, should I use a simpler row key plus a
filter on timestamp? Or something else?
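
To make point 2 concrete, here is a minimal sketch of one possible workaround, assuming row keys are compared byte by byte as unsigned values: encode the timestamp as a fixed-width 8-byte big-endian integer, so that byte order and numeric order agree. The helper names (encode_ts, make_row_key, start_key_after) are hypothetical and are not part of any HBase API; the sketch only illustrates the ordering, and it does not address the concern in point 1.

```python
import struct

SEP = b"|"  # separator used in the proposed key layout


def encode_ts(ts_millis: int) -> bytes:
    """Encode a non-negative millisecond timestamp as 8 big-endian bytes.

    With a fixed width and big-endian layout, comparing the bytes
    lexicographically gives the same result as comparing the numbers.
    """
    return struct.pack(">Q", ts_millis)


def make_row_key(ts_millis: int, source: str, title: str, type_: str) -> bytes:
    """Build timestamp|source|title|type with a binary timestamp prefix.

    The timestamp always occupies the first 8 bytes, so it should be read
    back by position, not by splitting on the separator.
    """
    tail = SEP.join(part.encode("utf-8") for part in (source, title, type_))
    return encode_ts(ts_millis) + SEP + tail


def start_key_after(ts_millis: int) -> bytes:
    """Smallest key prefix that sorts after every key with this timestamp."""
    return encode_ts(ts_millis + 1)


# Decimal strings of different lengths sort "wrong":
print(b"999" > b"1372776400441")                      # True
# The fixed-width binary encoding sorts numerically:
print(encode_ts(999) < encode_ts(1372776400441))      # True
```

Under the same assumption, a scan that starts at start_key_after(1372776400441) would see only keys whose timestamp is numerically greater than 1372776400441. Zero-padding the decimal string to a fixed width would have the same ordering effect while keeping the key human-readable.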

Best,
Flavio
