hbase-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: What is the best value to be used in rowkey
Date Sat, 22 Sep 2012 14:47:02 GMT
I would say it depends on your context. As you say, your 'primary key'
must be distinct for any two different records.
Even if you use a hash in addition to the timestamp, you cannot
guarantee that a record won't be overwritten.
If something in your context can act as an identifier, you
should use it; otherwise you need to create one.
If you know that a given timestamp will only be loaded from a single
source, adding a counter could do the trick.
If this is not the case, you could append information to the timestamp so
that there is a single source for any given pre-key value, and you could
then append a counter (which only needs to provide a distinct value for
each record sharing the same pre-key value).
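
To make the last suggestion concrete, here is a minimal sketch of such a key
layout: timestamp, then a source identifier, then a per-timestamp counter.
It is an illustration only, not HBase client code. The class name
RowKeyBuilder, the 2-byte source id, and the 4-byte counter width are all
assumptions made for this example, and it assumes each loader process is
given its own source id.

```python
import struct
from collections import defaultdict


class RowKeyBuilder:
    """Builds keys laid out as <timestamp:8><source id:2><counter:4> (hypothetical layout)."""

    def __init__(self, source_id: int):
        # Assumption: every loader gets a distinct source id in 0..65535.
        self.source_id = source_id
        # One counter per timestamp seen by this loader.
        self._counters = defaultdict(int)

    def next_key(self, timestamp_ms: int) -> bytes:
        seq = self._counters[timestamp_ms]
        self._counters[timestamp_ms] += 1
        # Big-endian, fixed-width fields, so that comparing keys byte by byte
        # gives the same order as comparing (timestamp, source, counter).
        return struct.pack(">QHI", timestamp_ms, self.source_id, seq)


# Two loaders producing records that carry the same timestamp.
loader_a = RowKeyBuilder(source_id=1)
loader_b = RowKeyBuilder(source_id=2)
ts = 1348325222000
keys = [loader_a.next_key(ts), loader_a.next_key(ts), loader_b.next_key(ts)]
```

The counter here lives in memory, so it only separates records written by
one running loader; if a loader restarts and replays the same timestamps,
it will regenerate the same keys unless the counter state is kept somewhere.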

I would love to hear about alternatives, though.



On Sat, Sep 22, 2012 at 4:29 PM, Ramasubramanian Narayanan <
ramasubramanian.narayanan@gmail.com> wrote:

> Hi,
> Can anyone suggest the best value to use for a rowkey in
> an HBase table, one that will not produce duplicates at any point in time? For
> example, a timestamp with nanoseconds may get duplicated if we are loading
> a batch file.
> regards,
> Rams

Bertrand Dechoux
