Date: Wed, 31 Aug 2011 10:42:02 -0400
Subject: Re: Updates lost
From: Jim Ancona
To: user@cassandra.apache.org

You could also look at Hector's approach in:
https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java

It works well and I believe there was some performance testing done on
it as well.

Jim

On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna wrote:
> Sorry - misread your earlier email.  I would login to IRC and ask in #cassandra.  I would think given the nature of nanotime you'll run into harder to track down problems, but it may be fine.
>
> On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
>
>> Do you see any problem with my approach to derive the current time in
>> nano seconds though?
>>
>> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
>> wrote:
>>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because System.nanoTime javadoc says "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time."
>>>
>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>
>>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>>>
>>>> Indeed it's microseconds. We are talking about how to achieve the
>>>> precision of microseconds. One way is System.currentTimeInMillis() *
>>>> 1000. It's only precise to milliseconds. If there are more than one
>>>> update in the same millisecond, the second one may be lost. That's my
>>>> original problem.
>>>>
>>>> The other way is to derive from System.nanoTime(). This function
>>>> doesn't directly return the time since epoch.
>>>> I used the following:
>>>>
>>>>       private static long nanotimeOffset = System.nanoTime()
>>>>                       - System.currentTimeMillis() * 1000000;
>>>>
>>>>       private static long currentTimeNanos() {
>>>>               return System.nanoTime() - nanotimeOffset;
>>>>       }
>>>>
>>>> The timestamp to use is then currentTimeNanos() / 1000.
>>>>
>>>> Anyone sees problem with this approach?
>>>>
>>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo wrote:
>>>>>
>>>>>
>>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna
>>>>> wrote:
>>>>>>
>>>>>> I would not use nano time with cassandra.  Internally and throughout the
>>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>>> as the timestamp value.
>>>>>>
>>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>>>
>>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>>>
>>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>>>
>>>>>>> Looks like the theory is correct for the java case at least.
>>>>>>>
>>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>>>
>>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>>>
>>>>>>> Haven't checked the Python side to see if it's similar situation.
>>>>>>>
>>>>>>> Cheers.
>>>>>>>
>>>>>>> Jiang
>>>>>>>
>>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen wrote:
>>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>>> still be right. Will dig more.
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>>> wrote:
>>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>>> it can't be the cause. Also I have the same problem with a Java
>>>>>>>>>> client
>>>>>>>>>> using Pelops.
>>>>>>>>>
>>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>>> level to ensure that a read sees your write.
>>>>>>>>>
>>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>>> off hand.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>> Isn't the standard microseconds ? (System.currentTimeMillis()*1000L)
>>>>> http://wiki.apache.org/cassandra/DataModel
>>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>>> things with time BadThingsWillHappen TM
>>>>>
>>>>>
>>>
>>>
>
>
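
For anyone following the thread, here is a minimal sketch of one way to get
timestamps that stay on the wall-clock microsecond scale (so they compare
sanely with what the CLI writes) while never handing out the same value twice
inside one JVM. It is not Hector's actual code and not anything from Pelops or
Cassandra; the class name MicrosecondClock and the method next() are made up
for illustration. The idea is simply to take System.currentTimeMillis() * 1000
and, when two calls land in the same millisecond, bump the result by one
microsecond.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for illustration; not part of any client library. */
final class MicrosecondClock {
    // Last timestamp handed out, in microseconds since the Unix epoch.
    private static final AtomicLong LAST = new AtomicLong();

    private MicrosecondClock() {}

    /**
     * Returns the wall-clock time in microseconds. If the wall clock has not
     * moved past the previously returned value (same millisecond, or the
     * clock stepped backwards), returns the previous value plus one instead,
     * so results are strictly increasing within this JVM.
     */
    static long next() {
        // The update function has no side effects, so it is safe for
        // updateAndGet to retry it under contention.
        return LAST.updateAndGet(last ->
                Math.max(System.currentTimeMillis() * 1000L, last + 1));
    }
}
```

Calling MicrosecondClock.next() once per mutation should avoid the "second
update in the same millisecond is lost" case described above, at least for a
single client process. It does nothing for two separate clients writing in the
same millisecond, and it assumes fewer than 1000 writes per millisecond on
average, otherwise the values drift ahead of the real clock.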