impala-reviews mailing list archives

From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions
Date Mon, 21 Nov 2016 17:24:54 GMT
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5129

to look at the new patch set (#2).

Change subject: IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions
......................................................................

IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions

Using TimestampValue (or an equivalent string representation) for
timestamps that must identify a point in time doesn't work because the
same wall-clock time can represent multiple points in time.  For example,
the timestamp:
'2016-11-06 01:01 AM' occurred twice on the weekend US daylight saving
time ended.

Instead, we should use unix time directly rather than trying to derive
unix time from a (timezone-less) timestamp.
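
As an illustration only (not code from this change), the ambiguity can be
sketched in Python.  The zone America/Los_Angeles and the 2016-11-06
fall-back date are assumptions chosen for the example; the message above
does not name a time zone.  The sketch needs Python 3.9+ and a system
time zone database.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumed zone for illustration; US clocks fell back on 2016-11-06.
tz = ZoneInfo("America/Los_Angeles")

# A timezone-less wall-clock reading, like a TimestampValue string.
wall = datetime(2016, 11, 6, 1, 1)

# fold=0 selects the first occurrence (PDT), fold=1 the second (PST).
first = wall.replace(tzinfo=tz, fold=0)
second = wall.replace(tzinfo=tz, fold=1)

# Both render to the same timezone-less string...
first_text = first.strftime("%Y-%m-%d %I:%M %p")
second_text = second.strftime("%Y-%m-%d %I:%M %p")

# ...but they are distinct points in time, one hour apart in unix time.
first_unix = int(first.timestamp())
second_unix = int(second.timestamp())
gap_seconds = second_unix - first_unix

print(first_text, first_unix)
print(second_text, second_unix)
```

Deriving unix time from the string alone has to guess which of the two
instants was meant, whereas a unix time captured directly is unambiguous.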

Note that there are other questionable uses of TimestampValue for
internal Impala service stuff, but I want to fix them separately as they
are not as important and fixing does add some risk.

While I'm here, remove a template TimestampValue constructor that was
unused and is confusing.

We don't have any end-to-end tests that exercise column lineage, so add
a simple custom cluster test that enables lineage and verifies that the
start and end unix times are within appropriate bounds.  The other column
lineage graph fields are at least tested via planner tests.
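
A rough sketch of what such a bounds check could look like follows.  This
is not the contents of tests/custom_cluster/test_lineage.py; the helper
name and the JSON field names "timestamp" and "endTime" are assumptions
made for the example, and the start <= end ordering check is an extra
choice of this sketch.

```python
import json
import time


def lineage_times_in_bounds(lineage_json, lower_unix, upper_unix):
    """Return True if the entry's start and end unix times are ordered
    and fall within [lower_unix, upper_unix].

    The field names "timestamp" and "endTime" are assumed here.
    """
    entry = json.loads(lineage_json)
    start = entry["timestamp"]
    end = entry["endTime"]
    return lower_unix <= start <= end <= upper_unix


# Record wall-clock bounds around the (simulated) query execution.
before = int(time.time())
# Stand-in for a lineage log entry that a real test would read from disk.
sample_entry = json.dumps({"timestamp": before, "endTime": before})
after = int(time.time())

in_bounds = lineage_times_in_bounds(sample_entry, before, after)
# An entry whose end precedes its start should be rejected.
backwards = json.dumps({"timestamp": before, "endTime": before - 3600})
backwards_ok = lineage_times_in_bounds(backwards, before - 7200, after)
```

Comparing integers of unix time keeps the check independent of the local
time zone of the machine running the test.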

Automated regression testing for the specific daylight savings issue is
difficult, as we'd have to cross the daylight savings boundary at just
the right time during query execution in order to reproduce it
reliably.  I'm open to ideas, though.

Testing:
- looped the new test overnight without any failures.
- ran the exhaustive tests.

Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9
---
M be/src/runtime/timestamp-value.h
M be/src/service/impala-server.cc
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A tests/custom_cluster/test_lineage.py
6 files changed, 86 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/5129/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
