hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Timestamp resolution
Date Mon, 16 Jun 2014 18:24:26 GMT
On Thu, Jun 12, 2014 at 11:30 AM, Michael Segel <michael_segel@hotmail.com>
wrote:

> From what I could see, Joda-Time is ms, not microseconds. Of course that was
> from a couple of years ago. Nothing popped out on their website itself.
>
> Again, when you start to get below the ms timestamp, you need to be a bit
> careful about relativity.
>

Even above the millisecond scale, you can't rely on wall clock measurements
directly either. For example, right now, my ntp-synchronized laptop is
estimating 3.1ms of error with a max error bound of 139ms. Anyone using the
timestamp component alone as a way to determine event ordering is already
fooling themselves.
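
To make that concrete, here is a minimal Java sketch (not from any HBase code) of comparing two clock readings that each carry an error bound. The class and method names are hypothetical, and the error bound is assumed to come from something like the NTP figure quoted above. Two events can only be ordered with certainty when their uncertainty intervals do not overlap.

```java
/** A wall-clock reading plus its maximum error, both in milliseconds (hypothetical type). */
final class BoundedTimestamp {
    final long millis;          // reported wall-clock time
    final long maxErrorMillis;  // assumed input, e.g. the max error bound reported by NTP

    BoundedTimestamp(long millis, long maxErrorMillis) {
        this.millis = millis;
        this.maxErrorMillis = maxErrorMillis;
    }

    /** Earliest instant at which the event could really have happened. */
    long earliest() { return millis - maxErrorMillis; }

    /** Latest instant at which the event could really have happened. */
    long latest() { return millis + maxErrorMillis; }

    /** True only if this event certainly happened before the other one. */
    boolean definitelyBefore(BoundedTimestamp other) {
        return latest() < other.earliest();
    }
}
```

With a 139ms bound on each side, two readings taken 100ms apart cannot be ordered in either direction; they would need to be more than 278ms apart before definitelyBefore() returns true.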


>
> What does that timestamp really mean?
>
> Let’s walk through your example and then we can see why we’re talking past
> each other.
>
> Time really is relative.
>
> If you’re looking at sensor data, then it’s not a TS; the temporal data
> is an element of the record and not the metadata.
>

I think folks already answered this - in the use case at hand, it's not
actually a time measurement, but rather a number generated by a timestamp
oracle.
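
For readers who have not seen one, here is a rough Java sketch of the kind of oracle described further down the thread (resolve ms, then fill inside the ms with a counter). It is only an illustration under my own assumptions: the class name, the microsecond unit, and the overload that takes the clock reading as a parameter are all hypothetical, and nothing here is an HBase API.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical oracle: microsecond-scale values that track the wall clock and never repeat. */
final class MicrosTimestampOracle {
    // Last value handed out; shared by all threads.
    private final AtomicLong last = new AtomicLong();

    /** Returns a strictly increasing value, nominally microseconds since the epoch. */
    long next() {
        return next(System.currentTimeMillis());
    }

    /** Same as next(), with the millisecond clock reading passed in (handy for testing). */
    long next(long wallClockMillis) {
        long floor = wallClockMillis * 1000L; // ms -> us; the sub-ms digits start at 0
        // If the clock has moved on, jump to it; otherwise bump the previous value by one.
        return last.updateAndGet(prev -> Math.max(prev + 1, floor));
    }
}
```

The values stay within the current millisecond as long as fewer than 1000 are requested in that millisecond; beyond that, or if the clock steps backwards, they run ahead of the wall clock rather than repeat. They are ordering tokens, not time measurements.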

-Todd


>
>
> On Jun 12, 2014, at 7:13 PM, Todd Lipcon <todd@cloudera.com> wrote:
>
> > The OS has it. Here's an implementation from one of my C++ projects:
> >
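> > // Returns the time since the Epoch measured in microseconds.
> > // (MicrosecondsInt64 is presumably a 64-bit integer typedef; needs <time.h>.)
> > inline MicrosecondsInt64 GetCurrentTimeMicros() {
> >   timespec ts;
> >   clock_gettime(CLOCK_REALTIME, &ts);
> >   // Widen before multiplying so a 32-bit time_t cannot overflow.
> >   return static_cast<MicrosecondsInt64>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
> > }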
> >
> > Whether it's trivially available from Java, I'm not sure. But I seem to
> > recall that JodaTime has it, no?
> >
> > -Todd
> >
> >
> > On Thu, Jun 12, 2014 at 11:06 AM, Michael Segel <michael_segel@hotmail.com>
> > wrote:
> >
> >> Silly question.
> >> How do you get time in microseconds?
> >>
> >>
> >> On Jun 12, 2014, at 2:56 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >>
> >>> Thank you for the clarification, this is what threw me (the initial mail):
> >>>
> >>>> We do have 8 bytes for the TS, though. Not enough to store nanosecs (that
> >>>> would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but enough for
> >>>> microseconds (292279 years).
> >>>> Should we just store the TS in microseconds? We could do that right now (and
> >>>> just keep the ms resolution for now - i.e. the us part would always be 0
> >>>> for now).
> >>>
> >>>
> >>> This says we might want to store a timestamp representation that can handle
> >>> microsecond resolution. My next step was to wonder about the availability
> >>> and practicality of microsecond resolution clocks. I don't take Michael's
> >>> position of "don't".
> >>>
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 10:39 PM, lars hofhansl <larsh@apache.org> wrote:
> >>>
> >>>> The issues you cite are all orthogonal. We have client/RS time now, we
> >>>> have clock skew now, that is completely independent from the time
> >>>> resolution.
> >>>>
> >>>>
> >>>> I explained the need I saw for this before. Lemme include:
> >>>>
> >>>> On Fri, May 23, 2014 at 06:16PM, lars hofhansl wrote:
> >>>>> The specific discussion here was a transaction engine doing snapshot
> >>>>> isolation using the HBase timestamps, but still be close to wall clock time
> >>>>> as much as possible.
> >>>>> In that scenario, with ms resolution you can only do 1000 transactions/sec,
> >>>>> and so you need to turn the timestamp into something that is not wall clock
> >>>>> time as HBase understands it (and hence TTL, etc, will no longer work, as
> >>>>> well as any other tools you've written that use the HBase timestamp).
> >>>>> 1m transactions/sec are good enough (for now, I envision in a few years
> >>>>> we'll be sitting here wondering how we could ever think that 1m
> >>>>> transaction/sec are sufficient) :)
> >>>>>
> >>>>
> >>>>
> >>>> The point is: Even if you had a timestamp oracle (that can resolve ms and
> >>>> fill inside ms resolution with a counter), there'd be no way to use this as
> >>>> the HBase timestamp while being close to wall clock (so that TTL, etc,
> >>>> still works).
> >>>> So specifically I was not advocating an automatic higher time resolution
> >>>> (as far as I know that cannot be done reliably in Java across
> >>>> multiple cores). I was advocating allowing clients with access to a
> >>>> (perhaps, but not necessarily single threaded) timestamp oracle to store
> >>>> those timestamps and still make use of all HBase optimizations (filtering
> >>>> HFiles, TTL, etc).
> >>>>
> >>>>
> >>>> -- Lars
> >>>>
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Michael Segel <michael_segel@hotmail.com>
> >>>> To: dev@hbase.apache.org
> >>>> Cc: lars hofhansl <larsh@apache.org>
> >>>> Sent: Wednesday, June 11, 2014 2:03 PM
> >>>> Subject: Re: Timestamp resolution
> >>>>
> >>>>
> >>>> Weirdly enough I find that I have to agree with Andrew.
> >>>>
> >>>> First, how do you get time in units smaller than a ms?
> >>>> Second, clock skew becomes an issue.
> >>>> Third, which clock are you using? The client machine? The RS? And then how
> >>>> do you synchronize each of the RS to be within a ms of each other?
> >>>> Correct me if I’m wrong but NTP doesn’t give that close of a sync.
> >>>>
> >>>> Sorry, but really, not a good idea.
> >>>>
> >>>> If you want this… you can store the temporal data as a column.
> >>>>
> >>>> Time really is relative.
> >>>>
> >>>>
> >>>> On May 25, 2014, at 12:53 AM, Stack <stack@duboce.net> wrote:
> >>>>
> >>>>> On Fri, May 23, 2014 at 5:27 PM, lars hofhansl <larsh@apache.org> wrote:
> >>>>>
> >>>>>> We have discussed this in the past. It just came up again during an
> >>>>>> internal discussion.
> >>>>>> Currently we simply store a Java timestamp (millisec since epoch), i.e. we
> >>>>>> have ms resolution.
> >>>>>>
> >>>>>> We do have 8 bytes for the TS, though. Not enough to store nanosecs (that
> >>>>>> would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but enough for
> >>>>>> microseconds (292279 years).
> >>>>>> Should we just store the TS in microseconds? We could do that right now
> >>>>>> (and just keep the ms resolution for now - i.e. the us part would always be
> >>>>>> 0 for now).
> >>>>>> Existing data must be in ms of course, so we'd grandfather that in, but
> >>>>>> new tables could store by default in us.
> >>>>>>
> >>>>>> We'd need to make this configurable at both the column family level and
> >>>>>> client level, so clients could still opt to see data in ms.
> >>>>>>
> >>>>>> Comments? Too much to bite off?
> >>>>>>
> >>>>>> -- Lars
> >>>>>>
> >>>>>>
> >>>>> I'm a fan.  As Enis cites, HBASE-8927 has good discussion.  No
> >>>>> configuration I'd say.  Just move to the new regime (though I suppose we
> >>>>> should let you turn it off).
> >>>>>
> >>>>> I think it was Liu Shaohui (IIRC) who made a suggestion that had us put
> >>>>> together ms and nanos under a synchronized block stamping the ts on Cells
> >>>>> (left-shift the currentTimeMillis and fill in the bottom bytes with as much
> >>>>> of the nanos as fits; i.e. your micros).  Rather than nanos/micros, we
> >>>>> could use a counter instead if a Cell arrives in the same ms.  Would be
> >>>>> costly having all ops go via one code block to get 'time' across cores and
> >>>>> handlers.
> >>>>>
> >>>>> St.Ack
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>>  - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >>> (via Tom White)
> >>
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
