impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Youwei Wang (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3504: UDF for current timestamp in UTC
Date Fri, 23 Sep 2016 07:30:47 GMT
Youwei Wang has posted comments on this change.

Change subject: IMPALA-3504: UDF for current timestamp in UTC
......................................................................


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/4490/3//COMMIT_MSG
Commit Message:

Line 10: as a TIMESTAMP value. Generated timestamp has been verified using
> Please try to break paragraphs close the red line, and put two line breaks 
Done


PS3, Line 11: ine time services.
> What do you mean by this?
Done


http://gerrit.cloudera.org:8080/#/c/4490/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4117:       TYPE_TIMESTAMP));
> Please check that this relates to now() in the way that you expect.
Done


http://gerrit.cloudera.org:8080/#/c/4490/3/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

Line 317: TimestampVal TimestampFunctions::Now(FunctionContext* context) {
> It looks to me like this is now only used one place. Can you inline it ther
Greetings, Jim. 
Please allow me to clarify your intention. I guess you want me to remove both the declaration
and implementation of this TimestampFunctions::Now from timestamp-functions.h and timestamp-functions-ir.cc
and then embed such code:

const TimestampValue* now = context->impl()->state()->now();
TimestampVal return_val;
now->ToTimestampVal(&return_val);
return return_val;

to the place which you think is the only caller of this Now() function?

If so, I am afraid there is another reason forbidding me to do so: since there exist one implicit
caller in the impala_functions.py file:
[['now', 'current_timestamp'], 'TIMESTAMP', [], '_ZN6impala18TimestampFunctions3NowEPN10impala_udf15FunctionContextE'],

Basically, this code is the entry of the front-end and it connects  the front-end to the back-end
using the function signature as you see here. If we take it out of the .h file, this will
be broken and queries like "select now()/current_timestamp();" will stop working.

Based on the above reason, maybe we should not touch this code.
Please feel free to point out my mistake if you think I am wrong. Thank you.


http://gerrit.cloudera.org:8080/#/c/4490/3/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS3, Line 121: utc_tim
> 'utc_timestamp', here and below
Done


http://gerrit.cloudera.org:8080/#/c/4490/3/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS3, Line 113: dinated Univer
> 'Coordinated Universal Time ("UTC")'
Done


PS3, Line 115: ck such as a manua
> Does DST on the local machine change the universal time?
Greetings, Jim.
The answer to this question is NO. I have corrected this comment.


PS3, Line 117: UtcTime() {
> 'UtcTime'
Done


http://gerrit.cloudera.org:8080/#/c/4490/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 842:   //typedef boost::date_time::c_local_adjustor<ptime> local_adj;
> It's not a good use of a line to define a typedef that's used exactly once.
Done


PS3, Line 843: &
> Why are you using references, here and below?
Greetings, Jim.
My motivation is to avoid the repeated object constructions when they are assigned back and
forth. It is the same idea as Java reference. So your concern is maybe this will expose some
memory risks by doing so?


PS3, Line 843: unive
> 'universal_time'
Done


PS3, Line 844: local
> 'local_time'
Done


Line 845:     universal_time);
> DCHECK that ltime and utime are close - with 48 hours of each other, for in
Greetings, Jim.
This DCHECK task is easy to do. And do you think the threshold of 12 hours would be okay?
I believe the delta range between UTC and local time would not be bigger than 12 hours in
term of the absolute value.


PS3, Line 842:   //typedef boost::date_time::c_local_adjustor<ptime> local_adj;
             :   const ptime& universal_time = boost::posix_time::microsec_clock::universal_time();
             :   const ptime& local_time = boost::date_time::c_local_adjustor<ptime>::utc_to_local(
             :     universal_time);
             :   const time_duration td_delta = universal_time - local_time;
> I see your point. I personally don't trust any clock or any timezone db. :-
Greetings, Gentlemen.
The following is the output of the postgres:
Version: PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2,
64-bit

postgres=# select now(), current_timestamp at time zone 'UTC';
              now              |          timezone          
-------------------------------+----------------------------
 2016-09-23 11:01:45.907497+08 | 2016-09-23 03:01:45.907497
(1 row)


===========================================================

The following is the output of the mysql:
mysql  Ver 14.14 Distrib 5.5.49, for debian-linux-gnu (x86_64) using readline 6.3

mysql> select now(), utc_timestamp();
+---------------------+---------------------+
| now()               | utc_timestamp()     |
+---------------------+---------------------+
| 2016-09-23 15:02:42 | 2016-09-23 07:02:42 |
+---------------------+---------------------+
1 row in set (0.16 sec)



Thanks. :)


PS3, Line 842:   //typedef boost::date_time::c_local_adjustor<ptime> local_adj;
             :   const ptime& universal_time = boost::posix_time::microsec_clock::universal_time();
             :   const ptime& local_time = boost::date_time::c_local_adjustor<ptime>::utc_to_local(
             :     universal_time);
             :   const time_duration td_delta = universal_time - local_time;
> I don't think it matters since (AFAIK) the functions don't both promise to 
Greetings, Gentlemen.
Yes, Jim has proposed a more strict restriction to eliminate errors between UTC and local
time. That is also why I do such modification from patch set 2 to patch set 3. 
And Matthew's idea also makes much sense here for if such functions are executed fast enough,
there would be no such error actually. That is what I have observed in my experiment. (I have
repeated this sql "select now(), utc_timestamp();" for many times, no significant error is
observed. And I know such error does ocurr under certain circumstances.)  
Considering it is not possible to take both solutions, so which solution do you think is more
applicable in our scenario? :)


http://gerrit.cloudera.org:8080/#/c/4490/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS3, Line 298: utc_timestamp_
> 'utc_timestamp_string'
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ee6ee192aa469f77c711f27ad324696a42004d1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Youwei Wang <youwei.a.wang@intel.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Youwei Wang <youwei.a.wang@intel.com>
Gerrit-HasComments: Yes

Mime
View raw message