cassandra-user mailing list archives

From Jiang Chen <jia...@gmail.com>
Subject Re: Updates lost
Date Wed, 31 Aug 2011 14:47:11 GMT
Cheers. That would be another solution.

On Wed, Aug 31, 2011 at 10:42 AM, Jim Ancona <jim@anconafamily.com> wrote:
> You could also look at Hector's approach in:
> https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java
>
> It works well and I believe there was some performance testing done on
> it as well.
>
> Jim
>
> On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna
> <jeremy.hanna1234@gmail.com> wrote:
>> Sorry - misread your earlier email.  I would login to IRC and ask in
>> #cassandra.  I would think given the nature of nanotime you'll run into
>> harder to track down problems, but it may be fine.
>>
>> On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
>>
>>> Do you see any problem with my approach to derive the current time in
>>> nano seconds though?
>>>
>>> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
>>> <jeremy.hanna1234@gmail.com> wrote:
>>>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is
>>>> because System.nanoTime javadoc says "This method can only be used to
>>>> measure elapsed time and is not related to any other notion of system or
>>>> wall-clock time."
>>>>
>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>
>>>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>>>>
>>>>> Indeed it's microseconds. We are talking about how to achieve the
>>>>> precision of microseconds. One way is System.currentTimeMillis() *
>>>>> 1000. It's only precise to milliseconds. If there is more than one
>>>>> update in the same millisecond, the second one may be lost. That's my
>>>>> original problem.
>>>>>
>>>>> The other way is to derive from System.nanoTime(). This function
>>>>> doesn't directly return the time since epoch. I used the following:
>>>>>
>>>>>       private static long nanotimeOffset = System.nanoTime()
>>>>>                       - System.currentTimeMillis() * 1000000;
>>>>>
>>>>>       private static long currentTimeNanos() {
>>>>>               return System.nanoTime() - nanotimeOffset;
>>>>>       }
>>>>>
>>>>> The timestamp to use is then currentTimeNanos() / 1000.
>>>>>
>>>>> Does anyone see a problem with this approach?
>>>>>
>>>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo
>>>>> <edlinuxguru@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> I would not use nano time with cassandra.  Internally and throughout the
>>>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>>>> as the timestamp value.
>>>>>>>
>>>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>>>>
>>>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>>>>
>>>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>>>>
>>>>>>>> Looks like the theory is correct for the java case at least.
>>>>>>>>
>>>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>>>>
>>>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>>>>
>>>>>>>> Haven't checked the Python side to see if it's similar situation.
>>>>>>>>
>>>>>>>> Cheers.
>>>>>>>>
>>>>>>>> Jiang
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <jiangc@gmail.com> wrote:
>>>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>>>> still be right. Will dig more.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>>>> <peter.schuller@infidyne.com> wrote:
>>>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>>>> it can't be the cause. Also I have the same problem with a Java client
>>>>>>>>>>> using Pelops.
>>>>>>>>>>
>>>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>>>> level to ensure that a read sees your write.
>>>>>>>>>>
>>>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>>>> off hand.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Isn't the standard microseconds ? (System.currentTimeMillis()*1000L)
>>>>>> http://wiki.apache.org/cassandra/DataModel
>>>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>>>> things with time BadThingsWillHappen TM
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>
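
The thread's two suggestions (scaling System.currentTimeMillis() by 1000, and a synchronized clock like the Hector class Jim links to) can be combined into a clock that never hands out the same microsecond value twice. The following is a minimal sketch of that idea, not Hector's actual implementation: the class name MonotonicMicros and its next() method are hypothetical, and the guarantee holds only within a single JVM.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper; not part of Cassandra, Pelops, or Hector. */
final class MonotonicMicros {
    // Last timestamp handed out, in microseconds since the epoch.
    private static final AtomicLong LAST = new AtomicLong();

    private MonotonicMicros() {}

    /**
     * Returns a microsecond timestamp that is strictly greater than every
     * value previously returned in this JVM. When several calls land in the
     * same millisecond, the value is bumped by one microsecond per call
     * instead of repeating.
     */
    static long next() {
        return LAST.updateAndGet(last ->
                Math.max(System.currentTimeMillis() * 1000L, last + 1));
    }

    public static void main(String[] args) {
        long first = next();
        long second = next();
        System.out.println(first + " then " + second);
    }
}
```

Because the value is anchored to the wall clock rather than to System.nanoTime(), it stays in the same unit and range as timestamps written by other clients that use milliseconds * 1000. If more than 1000 timestamps are requested within one millisecond, the clock runs slightly ahead of wall time until the real clock catches up.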

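Jeremy's warning about mixing units can be checked with plain arithmetic. The sketch below assumes a simplified last-write-wins rule (the larger timestamp is kept, ties ignored); TimestampUnits and its methods are hypothetical names used for illustration, not Cassandra APIs.

```java
/** Hypothetical illustration of mixing timestamp units; not a Cassandra API. */
final class TimestampUnits {
    private TimestampUnits() {}

    /** Simplified last-write-wins rule: the incoming write is kept only if its timestamp is larger. */
    static boolean incomingWins(long existingTimestamp, long incomingTimestamp) {
        return incomingTimestamp > existingTimestamp;
    }

    /** Microseconds since the epoch, derived from a millisecond clock reading. */
    static long microsFromMillis(long millis) {
        return millis * 1000L;
    }

    /** Nanoseconds since the epoch, derived from a millisecond clock reading. */
    static long nanosFromMillis(long millis) {
        return millis * 1_000_000L;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneYearLaterMillis = now + 365L * 24 * 60 * 60 * 1000;

        // A delete stamped in nanoseconds, followed a year later by an
        // insert stamped in microseconds.
        long deleteTimestamp = nanosFromMillis(now);
        long insertTimestamp = microsFromMillis(oneYearLaterMillis);

        System.out.println("insert wins: " + incomingWins(deleteTimestamp, insertTimestamp));
    }
}
```

A nanosecond value is 1000 times larger than the microsecond value for the same instant, so under this rule a write stamped in microseconds cannot overtake one stamped in nanoseconds for a very long time, which matches the "couldn't come back" behaviour described in the thread.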