cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Why does `now()` produce different times within the same query?
Date Thu, 01 Dec 2016 15:44:10 GMT
On Thu, Dec 1, 2016 at 4:06 AM, Sylvain Lebresne <sylvain@datastax.com>
wrote:

> One can of course always open a JIRA, but I'm going to strongly disagree
> with a change here (outside of a documentation one, that is).
>
> The now() function is a timeuuid generator, and it thus generates a unique
> timeuuid on every call, as specified by the timeuuid spec. I'll note that
> the documentation lists it under "Timeuuid functions", and has sentences
> like "the value returned by now() is guaranteed to be unique", so while I'm
> sure the documentation can be further clarified, I think it's pretty clear
> it's not the now() of SQL, and getting unique values on every call
> shouldn't be *that* surprising.
>
> Also, now() was primarily meant for use on timeuuid clustering columns for
> a time-series-like table, something like:
>   CREATE TABLE ts (
>     k int,
>     t timeuuid,
>     v text,
>     PRIMARY KEY (k, t)
>   );
> and if you use it multiple times in a batch, this would look something
> like:
>   BEGIN BATCH
>     INSERT INTO ts (k, t, v) VALUES (0, now(), 'foo');
>     INSERT INTO ts (k, t, v) VALUES (0, now(), 'bar');
>   APPLY BATCH;
> and you definitely want that to insert 2 "events", not just one.
>
> This is also why changing the behavior of this method *would* be a breaking
> change.
>
> Another reason this works the way it does is that functions in CQL are
> just that, functions. Each execution is unique and they have no notion of
> being executed in the same statement/batch/whatever. I actually think this
> is sensible, assuming one stops being obsessed with what other databases
> that aren't Apache Cassandra do.
>
> I will note that Ben seems to suggest keeping the return of now() unique
> across calls while keeping the time component equal, thus varying the rest
> of the uuid bytes. However:
>  - I'm starting to wonder what this would buy us. Why would someone be
>    super confused by the time changing across calls (in a single
>    statement/batch), but be totally not confused by the actual full return
>    not being equal? And how is that actually useful: you're getting
>    different results anyway and you're letting the server pick the
>    timestamp in the first place, so you're probably not caring about
>    millisecond precision of that timestamp in the first place.
>  - This would basically be a violation of the timeuuid spec.
>  - This would be a big pain in the code and make now() a special case
>    among functions. I'm unconvinced special cases make things easier
>    in general.
>
> So I'm all for improving the documentation if this confuses users due to
> expectations (mistakenly) carried from prior experiences, and please
> feel free to open a JIRA for that. I'm a lot less in agreement that there
> is something wrong with the way the function behaves in principle.
>
> > I can see why this issue has been largely ignored and hasn't had a
> > chance for the behaviour to be formally defined
>
> Don't make too many assumptions. The behavior is perfectly well defined:
> now() is a "normal" function and is evaluated whenever it's called,
> according to the timeuuid spec (or as close to it as we can make it).
>
> On Thu, Dec 1, 2016 at 7:25 AM, Benjamin Roth <benjamin.roth@jaumo.com>
> wrote:
>
>> Great comment. +1
>>
>> On 01.12.2016 06:29, "Ben Bromhead" <ben@instaclustr.com> wrote:
>>
>>> tl;dr +1 yup raise a jira to discuss how now() should behave in a single
>>> statement (and possibly extend to batch statements).
>>>
>>> The values of now should be the same if you assume that now() works like
>>> it does in relational databases such as postgres or mysql, however at the
>>> moment it instead works like sysdate() in mysql. Given that CQL is supposed
>>> to be SQL like, I think the assumption around the behaviour of now() was a
>>> fair one to make.
>>>
>>> I definitely agree that raising a jira ticket would be a great place to
>>> discuss what the behaviour of now() should be for Cassandra. Personally I
>>> would be in favour of seeing the deterministic component (the actual time
>>> part) being the same across multiple calls in the one statement or multiple
>>> statements in a batch.
>>>
>>> Cassandra documentation does not make any claims as to how now() works
>>> within a single statement and reading the code it shows the intent is to
>>> work like sysdate() from MySQL rather than now(). One of the identified
>>> dangers of making cql similar to sql is that, while yes it aids adoption,
>>> users will find that SQL like things don't behave as expected. Of course as
>>> a user, one shouldn't have to read the source code to determine correct
>>> behaviour.
>>>
>>> Given that a timeuuid is made up of deterministic and (pseudo)
>>> non-deterministic components I can see why this issue has been largely
>>> ignored and hasn't had a chance for the behaviour to be formally defined
>>> (you would expect now to return the same time in the one statement despite
>>> multiple calls, but you wouldn't expect the same behaviour for say a call
>>> to rand()).
>>>
>>> On Wed, 30 Nov 2016 at 19:54 Cody Yancey <yancey@uber.com> wrote:
>>>
>>>>     This is not a bug, and in fact changing it would be a serious bug.
>>>>
>>>> False. Absolutely no consumer would be broken by a change to guarantee
>>>> an identical time component that isn't broken already, for the simple
>>>> reason your code already has to handle that case, as it is in fact the
>>>> majority case RIGHT NOW. Users can hit this bug, in production, because
>>>> unit tests might not have experienced it! The time component should be the time
>>>> that the command was processed by the coordinator node.
>>>>
>>>>      would one expect a java/py/bash script that loops
>>>>
>>>> Individual Cassandra writes (which is what OP is referring to
>>>> specifically) are not loops. They are in almost every case atomic
>>>> operations that either succeed completely or fail completely. Allowing a
>>>> single atomic operation to witness multiple times in these corner cases is
>>>> not only surprising, as this thread demonstrates, it is also needlessly
>>>> restricting to what developers can use the database for, and provides NO
>>>> BENEFIT.
>>>>
>>>>     Calling now PRIOR to initiating multiple inserts is in most cases
>>>> exactly what one does...the ONLY practice is to set the value before
>>>> initiating the sequence of calls
>>>>
>>>> Also false. Cassandra does not have a way of doing this on the
>>>> coordinator node rather than the client device, and as I already showed,
>>>> the client device is the wrong place to do it in situations where
>>>> guaranteeing bounded clock-skew actually makes a difference one way or the
>>>> other.
>>>>
>>>> Thanks,
>>>> Cody
>>>>
>>>>
>>>>
>>>> On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle <daemeonr@gmail.com>
>>>> wrote:
>>>>
>>>> This is not a bug, and in fact changing it would be a serious bug.
>>>>
>>>> What it is is a wonderful case of bad coding: would one expect a
>>>> java/py/bash script that loops on a bunch of read/execute/update calls
>>>> where each iteration calls time to return the same exact time for the
>>>> duration of the execution of the code? Whether the code runs for 5
>>>> seconds or 5 hours?
>>>>
>>>> Every call to a system call is unique, including within C*. Calling now
>>>> PRIOR to initiating multiple inserts is in most cases exactly what one does
>>>> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
>>>> identical system time as would be the uuid of the row, one tries to call
>>>> time as close to just before the insert as possible. Then repeat.
>>>>
>>>> You have a logic issue in your code. If you want the same value for a
>>>> set of calls, the ONLY practice is to set the value before initiating the
>>>> sequence of calls.
>>>>
>>>>
>>>>
>>>> Daemeon C.M. Reiydelle
>>>> USA (+1) 415.501.0198
>>>> London (+44) (0) 20 8144 9872
>>>>
>>>> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey <yancey@uber.com> wrote:
>>>>
>>>> Getting the same TimeUUID values might be a major problem. Getting two
>>>> different TimeUUIDs that at least have the same time component would not
>>>> be a major problem, as this is the main case today. Getting different
>>>> time components
>>>> is actually the corner case, and it is a corner case that breaks
>>>> Internet-of-Things applications. We can tightly control clock skew in our
>>>> cluster. We most definitely CANNOT control clock skew on the thousands of
>>>> sensors that write to our cluster.
>>>>
>>>> Thanks,
>>>> Cody
>>>>
>>>> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille <rwille@fold3.com> wrote:
>>>>
>>>> In my opinion, this is not broken and “fixing” it would break existing
>>>> code. Consider a batch that includes multiple inserts, each of which
>>>> inserts the value returned by now(). Getting the same UUID for each insert
>>>> would be a major problem.
>>>>
>>>> Cheers
>>>>
>>>> Robert
>>>>
>>>>
>>>> On Nov 30, 2016, at 4:46 PM, Todd Fast <todd@digitalexistence.com>
>>>> wrote:
>>>>
>>>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>>>> unexpected and more than just a documentation issue. In general I can't
>>>> imagine any desirable properties of the current implementation, and there
>>>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>>>
>>>> Todd
>>>>
>>>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu <tliu@turnitin.com> wrote:
>>>>
>>>> Sorry for my typo. Obviously, I meant:
>>>> "It appears that a single query that calls Cassandra's `now()` time
>>>> function *multiple times* may actually cause a query to write or
>>>> return different times."
>>>>
>>>> Less of a surprise now that I realize more about the implementation,
>>>> but I agree that more explicit documentation around when exactly the
>>>> "execution" of each now() statement happens and what implications it has
>>>> for the resulting timestamps would be helpful when running into this.
>>>>
>>>> Thanks for the quick responses!
>>>>
>>>> -Terry
>>>>
>>>>
>>>>
>>>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek <msvaljek@gmail.com>
>>>> wrote:
>>>>
>>>> Every now() call in a statement is, under the hood, "replaced" with a
>>>> newly generated uuid.
>>>>
>>>> It can happen that they belong to different milliseconds in time.
>>>>
>>>> If you need to have the same timestamps, you need to set them on the
>>>> client side.
>>>>
>>>>
>>>> @msvaljek <https://twitter.com/msvaljek>
>>>>
>>>> 2016-11-29 22:49 GMT+01:00 Terry Liu <tliu@turnitin.com>:
>>>>
>>>> It appears that a single query that calls Cassandra's `now()` time
>>>> function may actually cause a query to write or return different times.
>>>>
>>>> Is this the expected or defined behavior, and if so, why does it behave
>>>> like this rather than evaluating `now()` once across an entire statement?
>>>>
>>>> This really affects UPDATE statements but to test it more easily, you
>>>> could try something like:
>>>>
>>>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>>>> FROM keyspace.table
>>>> LIMIT 100;
>>>>
>>>> If you run that a few times, you should eventually see that the
>>>> timestamp returned moves onto the next millisecond mid-query.
>>>>
>>>> --
>>>> *Software Engineer*
>>>> Turnitin - http://www.turnitin.com
>>>> tliu@turnitin.com
>>>>
>>>> --
>>> Ben Bromhead
>>> CTO | Instaclustr <https://www.instaclustr.com/>
>>> +1 650 284 9692
>>> Managed Cassandra / Spark on AWS, Azure and Softlayer
>>>
>>
>
I am not sure you saw my reply on the thread, but I believe everyone's needs
can be met. I will copy that here:

"Food for thought: Hive's UDFs introduced an annotation
@UDFType(deterministic = false)

http://dmtolpeko.com/2014/10/15/invoking-stateful-udf-at-map-and-reduce-side-in-hive/

The effect is the query planner can see when such a UDF is in use and
determine the value once at the start of a very long query."

Essentially, Hive had a similar if not identical problem: during a
long-running distributed process like map/reduce, some users wanted the
semantics of:

1) Each call should have a new timestamp

While other users wanted the semantics of:

2) Each call should generate the same timestamp

The solution implemented was to add an annotation to the UDF such that the
query planner would pick up the annotation and act accordingly.

(Here is a related issue: https://issues.apache.org/jira/browse/HIVE-1986)

As a result you can essentially implement two UDFs:

@UDFType(deterministic = false)
public class UDFNow

and for the other people

@UDFType(deterministic = true)
public class UDFNowOnce extends UDFNow

Both use cases are met in a sensible way.
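
To make the idea concrete, here is a minimal, self-contained sketch of how a
planner could act on such an annotation. It is only an illustration under
stated assumptions: UdfType below is a hypothetical stand-in for Hive's
@UDFType, QueryContext is a made-up per-query evaluator, and
UUID.randomUUID() is a placeholder for a real timeuuid generator. None of
this is Hive's or Cassandra's actual code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical stand-in for Hive's @UDFType; not the real Hive annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface UdfType {
    boolean deterministic() default true;
}

// Semantics 1: every call produces a new value.
@UdfType(deterministic = false)
class UdfNow implements Supplier<UUID> {
    @Override
    public UUID get() {
        // Placeholder: a real implementation would build a timeuuid here.
        return UUID.randomUUID();
    }
}

// Semantics 2: same generator, but marked so it is evaluated once per query.
@UdfType(deterministic = true)
class UdfNowOnce extends UdfNow {
}

// Hypothetical per-query context playing the role of the query planner.
class QueryContext {
    // Values of "evaluate once" functions, keyed by the function's class.
    private final Map<Class<?>, Object> cache = new HashMap<>();

    @SuppressWarnings("unchecked")
    <T> T call(Supplier<T> udf) {
        UdfType type = udf.getClass().getAnnotation(UdfType.class);
        boolean evaluateOnce = type != null && type.deterministic();
        if (!evaluateOnce) {
            // Unannotated or non-deterministic: evaluate on every call.
            return udf.get();
        }
        // Deterministic: compute on first use, then reuse within this query.
        return (T) cache.computeIfAbsent(udf.getClass(), k -> udf.get());
    }
}

class NowSemanticsDemo {
    public static void main(String[] args) {
        QueryContext query = new QueryContext();
        UdfNow now = new UdfNow();
        UdfNowOnce nowOnce = new UdfNowOnce();
        // Compare two calls of each function within the same query.
        System.out.println(query.call(now).equals(query.call(now)));
        System.out.println(query.call(nowOnce).equals(query.call(nowOnce)));
    }
}
```

In this sketch a new QueryContext per statement (or per batch) would decide
how far the "same value" guarantee extends; that scoping is a design choice,
not something the thread settles.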
