hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Timestamp resolution
Date Sat, 24 May 2014 18:51:35 GMT
I like the idea abstractly, but did the brainstorming include a strategy
for clocking time at sub-millisecond resolution? I am no expert in
timekeeping, but here is what I think I know:

In the datacenter, most operational environments are unlikely to take as
much care with their timekeeping as described in
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.html,
although some deployments could. +/- 1 ms is an OK assumption with
NTP-controlled clocks. There will be occasional small jumps in time,
backwards or forwards, which is why Jeffrey's suggestion is important.
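
As an illustration of the backward-jump concern, here is a minimal sketch of
a guard that never hands out a timestamp lower than or equal to the previous
one. The class name MonotonicTimestamps and the injected clock are
hypothetical, not HBase code, and this only orders events inside a single
JVM; it does not replace the MVCC-based ordering Jeffrey proposes.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of HBase): issues strictly increasing timestamps. */
final class MonotonicTimestamps {
    // Last value handed out; starts low so the first clock reading always wins.
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);
    // The wall clock is injected so the behavior can be exercised with fake readings.
    private final LongSupplier wallClock;

    MonotonicTimestamps(LongSupplier wallClock) {
        this.wallClock = wallClock;
    }

    /** Returns the clock reading, or last + 1 if the clock stalled or stepped back. */
    long next() {
        long now = wallClock.getAsLong();
        // The update function is side-effect free, so CAS retries are harmless.
        return last.updateAndGet(prev -> Math.max(now, prev + 1));
    }
}
```

In production the supplier would be System::currentTimeMillis. One trade-off
of this approach: under a burst of calls within the same millisecond, or
after a backward step, the issued values run ahead of the wall clock until
the clock catches up.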

In the software, System.currentTimeMillis reads wall clock time, but at a
granularity subject to the OS timer frequency. On Linux, the timekeeping
strategy has changed as the kernel has evolved, and I'm not sure about the
latest. It used to have 1 ms resolution (if HZ=1000). On Windows, IIRC, you
can tell the OS to enable a wall clock with higher than default granularity,
but with a substantial performance impact, I'd guess by radically
increasing the timer interrupt frequency. System.nanoTime supports a finer
granularity but is not wall clock time (it reads from the TSC or HPET). The
values returned are nanoseconds from a fixed origin time, and each JVM will
have a different arbitrary origin. Calling System.nanoTime is more
expensive than calling out to native code to issue an 'rdtsc' instruction
directly, but rdtsc again is not wall time, and reading it directly exposes
issues with TSC desynchronization between cores and possible cycle counter
frequency variations.
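
To make the distinction between the two JDK clocks concrete, here is a
minimal sketch of one way to approximate epoch microseconds: sample
System.currentTimeMillis once as an anchor, then add the elapsed
System.nanoTime delta. The class name MicrosClock and its methods are
hypothetical, not an HBase or JDK API, and this is a sketch under stated
assumptions rather than a recommendation.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: approximate epoch microseconds from the two JDK clocks. */
final class MicrosClock {
    private final long anchorEpochMicros; // wall clock at construction, in microseconds
    private final long anchorNanos;       // System.nanoTime reading taken at the same moment

    MicrosClock() {
        this(System.currentTimeMillis(), System.nanoTime());
    }

    // Explicit anchor values, mainly so the arithmetic can be checked deterministically.
    MicrosClock(long epochMillis, long nanoTimeReading) {
        this.anchorEpochMicros = TimeUnit.MILLISECONDS.toMicros(epochMillis);
        this.anchorNanos = nanoTimeReading;
    }

    /** Converts a System.nanoTime reading from this JVM into epoch microseconds. */
    long epochMicros(long nanoTimeReading) {
        // Only differences between nanoTime readings are meaningful.
        long elapsedNanos = nanoTimeReading - anchorNanos;
        return anchorEpochMicros + TimeUnit.NANOSECONDS.toMicros(elapsedNanos);
    }

    long nowMicros() {
        return epochMicros(System.nanoTime());
    }
}
```

The caveats are the ones described above. The anchor is only as accurate as
the millisecond clock it was sampled from, the result drifts away from the
NTP-corrected wall clock the longer the JVM runs, and two JVMs anchored at
different moments can disagree by up to a millisecond or more. Periodically
re-anchoring would bound the drift but reintroduce jumps.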



On Fri, May 23, 2014 at 6:43 PM, Jeffrey Zhong <jzhong@hortonworks.com> wrote:

>
> There is one more relevant JIRA. We can keep the MVCC values as each
> Cell's "version" forever, so we have a clear view of the ordering of
> changes to Cells. This way, we don't have to worry about clock skew or
> OSs with low time resolution.
>
> [BRAINSTORM] Combine MVCC and SeqId
> (https://issues.apache.org/jira/browse/HBASE-8763)
>
> On 5/23/14 5:56 PM, "Enis Söztutar" <enis.soz@gmail.com> wrote:
>
> >Also relevant:
> >
> >https://issues.apache.org/jira/browse/HBASE-8927
> >https://issues.apache.org/jira/browse/HBASE-6833
> >
> >
> >On Fri, May 23, 2014 at 5:54 PM, Enis Söztutar <enis.soz@gmail.com> wrote:
> >
> >> +1 to micros. Should we do it at the table level rather than the CF level?
> >>
> >> How can we get micros resolution efficiently in Java?
> >>
> >> Enis
> >>
> >>
> >> On Fri, May 23, 2014 at 5:27 PM, lars hofhansl <larsh@apache.org> wrote:
> >>
> >>> We have discussed this in the past. It just came up again during an
> >>> internal discussion.
> >>> Currently we simply store a Java timestamp (millisec since epoch), i.e.
> >>> we have ms resolution.
> >>>
> >>> We do have 8 bytes for the TS, though. Not enough to store nanoseconds
> >>> (that would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but
> >>> enough for microseconds (292279 years).
> >>> Should we just store the TS in microseconds? We could do that right now
> >>> (and just keep the ms resolution for now, i.e. the us part would
> >>> always be 0 for now).
> >>> Existing data must be in ms of course, so we'd grandfather that in, but
> >>> new tables could store in us by default.
> >>>
> >>> We'd need to make this configurable at both the column family level and
> >>> the client level, so clients could still opt to see data in ms.
> >>>
> >>> Comments? Too much to bite off?
> >>>
> >>> -- Lars
> >>>
> >>>
> >>
>
>
>
>
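
The range arithmetic in Lars's original message (2^63 ticks of a signed
8-byte timestamp, using a 365.24-day year) can be checked with a short
sketch. The class and method names here are hypothetical and exist only for
this illustration.

```java
/** Hypothetical helper: how many years a signed 64-bit timestamp can span. */
final class TimestampRange {
    // Same year length as in the quoted calculation: 3600 * 24 * 365.24 seconds.
    static final double SECONDS_PER_YEAR = 3600.0 * 24 * 365.24;

    /** unitsPerSecond is 1e3 for ms, 1e6 for us, 1e9 for ns. */
    static double yearsCovered(long unitsPerSecond) {
        // Long.MAX_VALUE is 2^63 - 1, which rounds to 2^63 as a double.
        return Long.MAX_VALUE / (double) unitsPerSecond / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.printf("ns: %.3f years%n", yearsCovered(1_000_000_000L));
        System.out.printf("us: %.0f years%n", yearsCovered(1_000_000L));
    }
}
```

This reproduces the quoted figures of roughly 292.279 years for nanoseconds
and roughly 292279 years for microseconds.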



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
