From: lars hofhansl
To: "dev@hbase.apache.org"
Subject: Re: Timestamp resolution
Date: Fri, 13 Jun 2014 15:43:09 -0700 (PDT)
Message-ID: <1402699389.28658.YahooMailNeo@web140602.mail.bf1.yahoo.com>
References: <1402551596.49228.YahooMailNeo@web140604.mail.bf1.yahoo.com>

That's a fair point.
I also would like to be able to make use of HBase timestamp based
optimizations, such as filtering HFiles, etc.
Maybe that still works, as HBase should not make any assumption about the
actual time here.

-- Lars

________________________________
From: Jeffrey Zhong
To: dev@hbase.apache.org
Sent: Thursday, June 12, 2014 3:01 PM
Subject: Re: Timestamp resolution

In situations where a client "timestamp" is used to support transactions,
it's not bad to disable server side TTL because TTL clean up doesn't care
about transaction semantics. Therefore, it likely breaks data integrity
across multiple tables or regions.

It's still a good use case to support TTL on "client provided timestamp".
I think we can provide a plug-in so that during compaction we allow users
to interpret their own timestamp to decide whether to GC old data.

Thanks,
-Jeffrey

On 6/11/14 10:39 PM, "lars hofhansl" wrote:

>The issues you cite are all orthogonal. We have client/RS time now, we
>have clock skew now, that is completely independent from the time
>resolution.
>
>I explained the need I saw for this before. Lemme include:
>
>On Fri, May 23, 2014 at 06:16PM, lars hofhansl wrote:
>> The specific discussion here was a transaction engine doing snapshot
>> isolation using the HBase timestamps, but still be close to wall clock
>> time as much as possible.
>> In that scenario, with ms resolution you can only do 1000
>> transactions/sec, and so you need to turn the timestamp into something
>> that is not wall clock time as HBase understands it (and hence TTL, etc,
>> will no longer work, as well as any other tools you've written that use
>> the HBase timestamp).
>> 1m transactions/sec are good enough (for now, I envision in a few years
>> we'll be sitting here wondering how we could ever think that 1m
>> transaction/sec are sufficient) :)
>
>The point is: Even if you had a timestamp oracle (that can resolve ms and
>fill inside ms resolution with a counter), there'd be no way to use this
>as the HBase timestamp while being close to wall clock (so that TTL, etc,
>still works).
>So specifically I was not advocating an automatic higher time resolution
>(as far as I know that cannot be done reliably in Java across
>multiple cores). I was advocating allowing clients with access to a
>(perhaps, but not necessarily single threaded) timestamp oracle to store
>those timestamps and still make use of all HBase optimization (filtering
>HFiles, TTL, etc).
>
>-- Lars
>
>________________________________
>From: Michael Segel
>To: dev@hbase.apache.org
>Cc: lars hofhansl
>Sent: Wednesday, June 11, 2014 2:03 PM
>Subject: Re: Timestamp resolution
>
>Weirdly enough I find that I have to agree with Andrew.
>
>First, how do you get time in units smaller than a ms?
>Second, clock skew becomes an issue.
>Third, which clock are you using? The client machine? The RS? And then
>how do you synchronize each of the RS to be within a ms of each other?
>Correct me if I'm wrong but NTP doesn't give that close of a sync.
>
>Sorry, but really, not a good idea.
>
>If you want this... you can store the temporal data as a column.
>
>Time really is relative.
>
>On May 25, 2014, at 12:53 AM, Stack wrote:
>
>> On Fri, May 23, 2014 at 5:27 PM, lars hofhansl wrote:
>>
>>> We have discussed this in the past. It just came up again during an
>>> internal discussion.
>>> Currently we simply store a Java timestamp (millisec since epoch),
>>> i.e. we have ms resolution.
>>>
>>> We do have 8 bytes for the TS, though. Not enough to store nanosecs
>>> (that would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but
>>> enough for microseconds (292279 years).
>>> Should we just store the TS in microseconds? We could do that right now
>>> (and just keep the ms resolution for now - i.e. the us part would
>>> always be 0 for now).
>>> Existing data must be in ms of course, so we'd grandfather that in, but
>>> new tables could store by default in us.
>>>
>>> We'd need to make this configurable at both the column family level and
>>> client level, so clients could still opt to see data in ms.
>>>
>>> Comments? Too much to bite off?
>>>
>>> -- Lars
>>>
>> I'm a fan. As Enis cites, HBASE-8927 has good discussion. No
>> configuration I'd say. Just move to the new regime (though I suppose we
>> should let you turn it off).
>>
>> I think it was Liu Shaohui (IIRC) who made a suggestion that had us put
>> together ms and nanos under a synchronized block stamping the ts on
>> Cells (left-shift the currentTimeMillis and fill in the bottom bytes
>> with as much of the nanos as fits; i.e. your micros). Rather than
>> nanos/micros, we could use a counter instead if a Cell arrives in the
>> same ms. Would be costly having all ops go via one code block to get
>> 'time' across cores and handlers.
>>
>> St.Ack
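
For reference, here is a rough sketch of two ideas from the thread: the range arithmetic for an 8-byte signed timestamp, and a timestamp oracle that keeps wall-clock milliseconds in the high part and fills the sub-millisecond part with a counter. It is written in Python for brevity and is not HBase code. The names `years_covered`, `MicrosOracle`, and `next_timestamp` are made up for illustration. The sketch assumes the value stays "microseconds since epoch" (ms * 1000 + counter) rather than a literal bit shift, so a TTL check that knows the unit could still compare it against wall-clock time.

```python
import threading
import time

# Same year length as used in the thread: 3600 * 24 * 365.24 seconds.
SECONDS_PER_YEAR = 3600 * 24 * 365.24


def years_covered(units_per_second: int) -> float:
    """Years representable by a signed 64-bit count of the given time unit."""
    return 2**63 / units_per_second / SECONDS_PER_YEAR


class MicrosOracle:
    """Hypothetical oracle: wall-clock ms in the high part, a counter below it."""

    def __init__(self, clock_ms=lambda: int(time.time() * 1000)):
        self._clock_ms = clock_ms
        self._lock = threading.Lock()  # the single code block all callers go through
        self._last_ms = -1
        self._counter = 0

    def next_timestamp(self) -> int:
        """Return a strictly increasing timestamp in microseconds since epoch."""
        with self._lock:
            now_ms = self._clock_ms()
            if now_ms > self._last_ms:
                # New millisecond: restart the counter.
                self._last_ms = now_ms
                self._counter = 0
            else:
                # Same millisecond (or the clock stepped back): bump the counter.
                self._counter += 1
                if self._counter >= 1000:
                    # Counter exhausted: borrow the next millisecond.
                    self._last_ms += 1
                    self._counter = 0
            return self._last_ms * 1000 + self._counter


if __name__ == "__main__":
    print(f"nanoseconds:  {years_covered(10**9):.3f} years")
    print(f"microseconds: {years_covered(10**6):.0f} years")
    oracle = MicrosOracle()
    print(oracle.next_timestamp(), oracle.next_timestamp())
```

The lock is the cost Stack points out: every caller serializes on one code block. Borrowing from the next millisecond when the counter runs out lets the value drift slightly ahead of the wall clock under sustained load above 1m timestamps/sec.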