cassandra-user mailing list archives

From Jim Ancona <...@anconafamily.com>
Subject Re: Updates lost
Date Wed, 31 Aug 2011 14:42:02 GMT
You could also look at Hector's approach in:
https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java

It works well and I believe there was some performance testing done on
it as well.
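
For anyone who doesn't want to follow the link, here is a minimal sketch of
one general way to hand out strictly increasing microsecond timestamps. It is
not Hector's actual code: the class name MicrosecondClock and the method
nextMicros() are made up for illustration, and it assumes that bumping the
value by one microsecond on a collision is acceptable for your workload.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Hector, Pelops, or Cassandra.
final class MicrosecondClock {
    // Last timestamp handed out, in microseconds since the epoch.
    private static final AtomicLong LAST_MICROS = new AtomicLong(0L);

    private MicrosecondClock() {}

    // Returns a timestamp in microseconds that is strictly greater than
    // any value previously returned by this method in this JVM.
    static long nextMicros() {
        while (true) {
            // Wall-clock time, only precise to the millisecond.
            long wallClock = System.currentTimeMillis() * 1000L;
            long last = LAST_MICROS.get();
            // If the millisecond has not advanced, step forward by 1 us.
            long candidate = wallClock > last ? wallClock : last + 1;
            // Retry if another thread updated the value in the meantime.
            if (LAST_MICROS.compareAndSet(last, candidate)) {
                return candidate;
            }
        }
    }
}
```

Two updates issued in the same millisecond then get distinct timestamps. The
trade-off is that the clock can run slightly ahead of wall time if it is
called more than 1000 times per millisecond for a sustained period.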

Jim

On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna
<jeremy.hanna1234@gmail.com> wrote:
> Sorry - misread your earlier email.  I would log in to IRC and ask in #cassandra.  I
> would think given the nature of nanotime you'll run into harder-to-track-down problems,
> but it may be fine.
>
> On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
>
>> Do you see any problem with my approach of deriving the current time in
>> nanoseconds, though?
>>
>> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
>> <jeremy.hanna1234@gmail.com> wrote:
>>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because
>>> System.nanoTime javadoc says "This method can only be used to measure elapsed time and is
>>> not related to any other notion of system or wall-clock time."
>>>
>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>
>>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>>>
>>>> Indeed it's microseconds. We are talking about how to achieve the
>>>> precision of microseconds. One way is System.currentTimeMillis() *
>>>> 1000. It's only precise to milliseconds. If there is more than one
>>>> update in the same millisecond, the second one may be lost. That's my
>>>> original problem.
>>>>
>>>> The other way is to derive from System.nanoTime(). This function
>>>> doesn't directly return the time since epoch. I used the following:
>>>>
>>>>       private static long nanotimeOffset = System.nanoTime()
>>>>                       - System.currentTimeMillis() * 1000000;
>>>>
>>>>       private static long currentTimeNanos() {
>>>>               return System.nanoTime() - nanotimeOffset;
>>>>       }
>>>>
>>>> The timestamp to use is then currentTimeNanos() / 1000.
>>>>
>>>> Does anyone see a problem with this approach?
>>>>
>>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>>>>>
>>>>>
>>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> I would not use nano time with cassandra.  Internally and throughout the
>>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>>> as the timestamp value.
>>>>>>
>>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>>>
>>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>>>
>>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>>>
>>>>>>> Looks like the theory is correct for the java case at least.
>>>>>>>
>>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>>>
>>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>>>
>>>>>>> Haven't checked the Python side to see if it's a similar situation.
>>>>>>>
>>>>>>> Cheers.
>>>>>>>
>>>>>>> Jiang
>>>>>>>
>>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <jiangc@gmail.com> wrote:
>>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>>> still be right. Will dig more.
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>>> <peter.schuller@infidyne.com> wrote:
>>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>>> it can't be the cause. Also I have the same problem with a Java
>>>>>>>>>> client using Pelops.
>>>>>>>>>
>>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>>> level to ensure that a read sees your write.
>>>>>>>>>
>>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>>> off hand.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>> Isn't the standard microseconds? (System.currentTimeMillis() * 1000L)
>>>>> http://wiki.apache.org/cassandra/DataModel
>>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>>> things with time BadThingsWillHappen TM
>>>>>
>>>>>
>>>
>>>
>
>
