impala-dev mailing list archives

From "Internal Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors
Date Thu, 25 Aug 2016 10:20:36 GMT
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
......................................................................


IMPALA-(3895,3859): Don't log file data on parse errors

Logging file or table data is a bad idea, and doing it by default is
particularly bad. This patch changes HdfsScanNode::LogRowParseError() to
log a file and offset only.

Testing: See rewritten tests.

To support testing this change, we also fix IMPALA-3895 by introducing
a canonical string, __HDFS_FILENAME__, that replaces every Hadoop filename
in the ERROR output before it is compared with the expected results. This
fixes a number of issues with the old way of matching filenames, which
purported to be a regex but really wasn't. In particular, we can now match
the rest of an ERROR line after the filename, which was not possible
before.

In some cases, we don't want to substitute filenames because the test
expects a very specific ERROR output. In that case we can write:

$NAMENODE/<filename>

and this patch will not perform _any_ filename substitutions on ERROR
sections that contain the $NAMENODE string.
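
A minimal sketch of the substitution and opt-out described above, not the
code in tests/common/test_result_verifier.py. The function names, the regex
for recognizing a filename, the default namenode address, and the sample
error line are all assumptions made for illustration. It also assumes the
test framework expands $NAMENODE to the real namenode address before
comparing.

```python
import re

# Placeholder named in the commit message.
HDFS_FILENAME = "__HDFS_FILENAME__"
# Marker that disables substitution for the whole section.
OPT_OUT_MARKER = "$NAMENODE"
# Assumption: a filename looks like hdfs://host[:port]/path and ends at
# whitespace or a comma. The real framework may recognize paths differently.
HDFS_PATH_RE = re.compile(r"hdfs://[^\s,]+")


def canonicalize_filenames(lines):
    """Replace every HDFS path in each line with the canonical placeholder."""
    return [HDFS_PATH_RE.sub(HDFS_FILENAME, line) for line in lines]


def errors_match(expected, actual, namenode="hdfs://localhost:20500"):
    """Compare expected ERROR lines against actual ERROR lines.

    If any expected line contains $NAMENODE, no filename substitution is
    performed on the section; $NAMENODE is expanded (an assumption) and the
    lines are compared exactly. Otherwise filenames in the actual output are
    canonicalized first, so the rest of each line is still compared.
    """
    if any(OPT_OUT_MARKER in line for line in expected):
        expanded = [line.replace(OPT_OUT_MARKER, namenode) for line in expected]
        return expanded == actual
    return expected == canonicalize_filenames(actual)


# Hypothetical error line, used only to exercise the sketch.
actual = ["parse error in file hdfs://localhost:20500/warehouse/t/f.txt, offset 10"]

# Default case: the filename is canonicalized, the offset is still checked.
print(errors_match(["parse error in file __HDFS_FILENAME__, offset 10"], actual))
# Opt-out case: the exact path must match.
print(errors_match(["parse error in file $NAMENODE/warehouse/t/f.txt, offset 10"], actual))
```

Because the placeholder replaces only the filename, text after it (here the
offset) still takes part in the comparison, which is the property the commit
message calls out.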

Finally, this patch fixes a bug where a test that had an ERRORS section
but no RESULTS section would silently pass without testing anything.

Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Reviewed-on: http://gerrit.cloudera.org:8080/4020
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hbase-scan-node-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-rcfile-scan-node-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-sequence-scan-errors.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/util/filesystem_utils.py
M tests/util/hdfs_util.py
19 files changed, 393 insertions(+), 407 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
