impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation
Date Fri, 18 Nov 2016 22:17:08 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
......................................................................


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 526:     parent_->parse_status_ = Status(*msg);
the claim is we are pulling this out to avoid status construction code, but didn't we already
construct a status if we call this function (in LogOrReturnError())?


PS6, Line 598:  
nit: extraneous space


Line 599:   if (!src->IsValidDate()) {
UNLIKELY


http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

PS6, Line 512: 
was this an incorrect comment?


PS6, Line 449: void* slot
shouldn't this be 'tuple'.  but the fact that this compiled probably means that this method
isn't actually ever called or defined (i.e. the one in BaseScalarColumnReader), and so this
declaration should just be deleted.


http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 156:     return date_.day_number() >= MIN_DAY_NUMBER && date_.day_number()
<= MAX_DAY_NUMBER;
are we sure these are the only invalidate TimestampValue's that may have been copied in? 
Are there any invalid time_ values?


http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/bad_parquet_data/README
File testdata/bad_parquet_data/README:

PS6, Line 16: out-of-range-timestamp.parq
> The parquet file is no longer in this folder, so I will remove this comment
not sure what you mean -- aren't you adding a parquet file to test this? could you just make
whatever change you are proposing.


http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
File testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test:

PS6, Line 7: Parquet file '$NAM
why is the message repeated?


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.gaal@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message