impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3894: Change the behavior parsing date "YY"
Date Wed, 02 Aug 2017 00:15:13 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3894: Change the behavior parsing date "YY"

Patch Set 9:


A few code comments that I think you can address now, plus a few behavior questions for Greg
- just to double check he's OK with this new behavior. Lets get his input on them before proceeding.
Commit Message:

PS9, Line 11: same as Hive's
How did you determine exactly what Hive's behavior is - did you find a reference in documentation,
or code? Can you add a link?

PS9, Line 12: into the interval [current time - 80 years, current time + 20 years).
looks like postgresql's behavior is different. I think they just break at the unix epoch.
Lets see if Greg is OK with us following Hive's behavior if that differs from Postgres. We
may wanna look at a few more databases to see if there is consensus.

mj=# select to_date('05 Dec 69', 'DD Mon YY');
(1 row)

mj=# select to_date('05 Dec 70', 'DD Mon YY');
(1 row)
File be/src/runtime/

PS9, Line 26: #
not needed here if it's also included in the header

PS9, Line 448: // Taking tok_len > 3 literally is Hive behavior.
this comment isn't clear to me

PS9, Line 547: unsigned short century_start_year = now_date.year() - 80;
             :     date century_start_date(century_start_year, now_date.month(),;
you can use boost operator-(time_duration)

this can also be precomputed as part of the DateTimeContext to avoid some cycles on every

PS9, Line 557: TimestampValue
just use ptime directly. we're trying to avoid instantiating TimestampValue except for real
data (i.e. as part of a tuple)

PS9, Line 558: TimestampValue

PS9, Line 558: TimestampValue(century_start_date,
it might be worth pre-computing this as part of the DateTimeFormatContext to cut down on the
overhead when parsing a lot of rows
File be/src/runtime/

PS9, Line 481:     // Test 1-digit year format with time to show the exact boundary
             :     (TimestampTC("y-MM-dd HH:mm:ss", "37-07-28 16:14:23", false, true, false,
             :                  2037, 7, 28, 16, 14, 23))
             :     (TimestampTC("y-MM-dd HH:mm:ss", "37-07-28 16:14:24", false, true, false,
             :                  1937, 7, 28, 16, 14, 24))
hm, postgres does something different with 'y', let's see what Greg wants us to do. It always
adds 2000:

mj=# select to_date('05 Dec 0', 'DD Mon y');
(1 row)

mj=# select to_date('05 Dec 9', 'DD Mon y');
(1 row)

mj=# select to_date('05 Dec 99', 'DD Mon y');
(1 row)

mj=# select to_date('05 Dec 999', 'DD Mon y');
(1 row)

mj=# select to_date('05 Dec 9999', 'DD Mon y');
(1 row)

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I5da761255915dc741f1dcc488fd4ef6ecc385896
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <>
Gerrit-Reviewer: Greg Rahn <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tianyi Wang <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message