impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Attila Jeges (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet
Date Thu, 23 Mar 2017 15:40:40 GMT
Attila Jeges has posted comments on this change.

Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet
......................................................................


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 304:   // Time zone for adjusting timestamp values read from Parquet files written by
> single line
That would make the line 92 characters long.


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 246:       DCHECK((is_timestamp_dependent_timezone_ && timezone_ == NULL) ||
> Even better, we can check this at the scan node level, e.g., in HdfsScanNod
Simplified the code. The timezone check is done in the FE, the BE is guaranteed to have a
valid timezone string.


Line 653:   }
> As mentioned above, we should terminate the query early when in such a bad 
Added DCHECK(false)


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

Line 194:   for (const string& tz_region: tz_region_list_) {
> No need to use auto here. const string& is clearer
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

Line 40:   /// If 'tz' is not found in the database, nullptr is returned.
> null pointer -> nullptr
Done


Line 46:   /// If 'tz' is not found in the database, nullptr is returned.
> null pointer -> nullptr
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 199:   /// conversion was successfull and false otherwise. If conversion failed *this
is set to
> Comment on what the state of *this is when false is returned.
Done


Line 202: 
> Comment on what the state of *this is when false is returned.
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 357:   const char *tz = env->GetStringUTFChars(timezone, NULL);
> const char*
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 122:     "the parquet.mr.int96.write.zone table property to UTC for new tables. This
changes "
> consistent placing of space (at line end or at line beginning)
Done


http://gerrit.cloudera.org:8080/#/c/5939/3/fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
File fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java:

Line 67:     analyzeJoin(analyzer);
> If we do the validation here, then there is no need to check in each HdfsSc
Done


http://gerrit.cloudera.org:8080/#/c/5939/3/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

Line 184:       if (getFileFormat() == THdfsFileFormat.KUDU) {
> You can check getFileFormat() here. The only non-HDFS formats we have are H
Done


Line 189:       String timezone = getTblProperties().get(HdfsTable.TBL_PROP_PARQUET_MR_WRITE_ZONE);
> We should only do this for HDFS tables right?
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/fe/src/main/java/org/apache/impala/service/FeSupport.java
File fe/src/main/java/org/apache/impala/service/FeSupport.java:

Line 87:   // Returns true if timezone String is valid according to the BE timezone database,
false
> valid according to the BE timezone database
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/tests/custom_cluster/test_parquet_timestamp_compatibility.py
File tests/custom_cluster/test_parquet_timestamp_compatibility.py:

Line 1: # Licensed to the Apache Software Foundation (ASF) under one
> Use new Apache license header. Do we still have the old header in some exis
My bad, I created this test from an old test script.


-- 
To view, visit http://gerrit.cloudera.org:8080/5939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <attilaj@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Attila Jeges <attilaj@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi+gerrit@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message