incubator-cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: Updates lost
Date Tue, 30 Aug 2011 19:43:09 GMT
Sorry, I misread your earlier email.  I would log in to IRC and ask in #cassandra.  Given the
nature of nanoTime, I would think you'll run into problems that are harder to track down, but
it may be fine.
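
For the same-millisecond collisions discussed in the quoted thread below, one option that
avoids nanoTime entirely is to keep System.currentTimeMillis() * 1000 as the base and bump
the value by one microsecond whenever it would repeat. A minimal sketch, assuming Java 8 or
later; MicrosecondClock and nextMicros are hypothetical names for illustration and are not
part of Cassandra or Pelops. The uniqueness only holds within a single JVM.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not part of Cassandra or any client library.
final class MicrosecondClock {
    // Last timestamp handed out, in microseconds since the epoch.
    private static final AtomicLong last = new AtomicLong();

    // Wall-clock time in microseconds, bumped by 1 when needed so that
    // no two calls in this JVM return the same value.
    static long nextMicros() {
        return nextMicros(System.currentTimeMillis());
    }

    // Same logic with the millisecond reading passed in explicitly.
    static long nextMicros(long nowMillis) {
        long candidate = nowMillis * 1000L;
        // Take the clock reading unless it would not be strictly greater
        // than the previous timestamp; in that case use previous + 1.
        return last.updateAndGet(prev -> Math.max(candidate, prev + 1));
    }

    public static void main(String[] args) {
        long a = nextMicros();
        long b = nextMicros();
        System.out.println(a + " then " + b);
    }
}
```

Two updates issued in the same millisecond then get distinct, increasing timestamps. If a
process sustains more than 1000 updates per millisecond, the values run ahead of the wall
clock, and separate clients can still collide with each other.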

On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:

> Do you see any problem with my approach to derive the current time in
> nanoseconds though?
> 
> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
> <jeremy.hanna1234@gmail.com> wrote:
>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because System.nanoTime
>> javadoc says "This method can only be used to measure elapsed time and is not related to any
>> other notion of system or wall-clock time."
>> 
>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>> 
>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>> 
>>> Indeed it's microseconds. We are talking about how to achieve the
>>> precision of microseconds. One way is System.currentTimeMillis() *
>>> 1000. It's only precise to milliseconds. If there is more than one
>>> update in the same millisecond, the second one may be lost. That's my
>>> original problem.
>>> 
>>> The other way is to derive from System.nanoTime(). This function
>>> doesn't directly return the time since epoch. I used the following:
>>> 
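>>>       // Offset between System.nanoTime()'s arbitrary origin and the epoch,
>>>       // in nanoseconds, captured once when the class is loaded.
>>>       private static final long nanotimeOffset = System.nanoTime()
>>>                       - System.currentTimeMillis() * 1000000L;
>>> 
>>>       // Approximate nanoseconds since the epoch.
>>>       private static long currentTimeNanos() {
>>>               return System.nanoTime() - nanotimeOffset;
>>>       }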
>>> 
>>> The timestamp to use is then currentTimeNanos() / 1000.
>>> 
>>> Does anyone see a problem with this approach?
>>> 
>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>>>> 
>>>> 
>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com>
>>>> wrote:
>>>>> 
>>>>> I would not use nano time with cassandra.  Internally and throughout the
>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>> as the timestamp value.
>>>>> 
>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>> 
>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>> 
>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>> 
>>>>>> Looks like the theory is correct for the java case at least.
>>>>>> 
>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>> 
>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>> 
>>>>>> Haven't checked the Python side to see if it's a similar situation.
>>>>>> 
>>>>>> Cheers.
>>>>>> 
>>>>>> Jiang
>>>>>> 
>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <jiangc@gmail.com> wrote:
>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>> still be right. Will dig more.
>>>>>>> 
>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>> <peter.schuller@infidyne.com> wrote:
>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>> it can't be the cause. Also I have the same problem with a Java
>>>>>>>>> client using Pelops.
>>>>>>>> 
>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>> level to ensure that a read sees your write.
>>>>>>>> 
>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>> off hand.
>>>>>>>> 
>>>>>>>> --
>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>> Isn't the standard microseconds? (System.currentTimeMillis()*1000L)
>>>> http://wiki.apache.org/cassandra/DataModel
>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>> things with time BadThingsWillHappen TM
>>>> 
>>>> 
>> 
>> 
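
The unit mismatch described in the quoted thread (a delete stamped in nanoseconds that later
writes could not override) comes down to comparing raw long values. A small Java sketch of
the arithmetic, assuming the larger timestamp wins as the thread describes; TimestampUnits
and its methods are hypothetical names, not Cassandra code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of mixing timestamp units; not Cassandra code.
final class TimestampUnits {
    // Epoch milliseconds expressed as a microsecond timestamp.
    static long microsTimestamp(long epochMillis) {
        return TimeUnit.MILLISECONDS.toMicros(epochMillis);
    }

    // Epoch milliseconds expressed as a nanosecond timestamp.
    static long nanosTimestamp(long epochMillis) {
        return TimeUnit.MILLISECONDS.toNanos(epochMillis);
    }

    // Assumed resolution rule: a write replaces an earlier one only if
    // its timestamp is numerically larger, regardless of unit.
    static boolean laterWriteWins(long earlierTs, long laterTs) {
        return laterTs > earlierTs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long deleteTs = nanosTimestamp(now);              // delete stamped in nanoseconds
        long insertTs = microsTimestamp(now + 3600000L);  // insert one hour later, in microseconds
        System.out.println("insert overrides delete: " + laterWriteWins(deleteTs, insertTs));
    }
}
```

The nanosecond value is 1000 times larger than a microsecond value for the same instant, so
the insert made an hour later still compares as older than the delete.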

