impala-reviews mailing list archives

From "Internal Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3943: Do not throw scan errors for empty Parquet files.
Date Wed, 12 Oct 2016 09:22:58 GMT
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3943: Do not throw scan errors for empty Parquet files.
......................................................................


IMPALA-3943: Do not throw scan errors for empty Parquet files.

For Parquet files that have no row groups and num_rows=0 in the
file footer, the Parquet scanner returns an error indicating
that the file is invalid. This behavior is a regression from
previous Impala versions, which accepted such files.

This patch restores the previous behavior and adds tests.
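
The restored behavior can be illustrated with a minimal Python sketch. It is
not the code in be/src/exec/hdfs-parquet-scanner.cc: FooterMetadata and
validate_footer are hypothetical names, and rejecting a footer that reports
rows but has no row groups is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FooterMetadata:
    """Hypothetical stand-in for the footer fields mentioned in the commit message."""
    num_rows: int
    row_groups: List[object] = field(default_factory=list)


def validate_footer(meta: FooterMetadata, filename: str) -> bool:
    """Return True if the file has rows to scan, False if it is validly empty.

    Raises ValueError only when the footer reports rows but lists no row
    groups (assumed behavior, not confirmed by the commit message).
    """
    if meta.num_rows == 0:
        # Empty file: accept it and produce no rows instead of a scan error.
        # This covers both zero row groups and row groups that hold no rows.
        return False
    if not meta.row_groups:
        raise ValueError(f"Invalid file: {filename} has no row groups")
    return True
```

The two num_rows=0 cases in this sketch correspond to the two test files
added by the change, zero_rows_zero_row_groups.parquet and
zero_rows_one_row_group.parquet.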

Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Reviewed-on: http://gerrit.cloudera.org:8080/4693
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M testdata/data/README
A testdata/data/zero_rows_one_row_group.parquet
A testdata/data/zero_rows_zero_row_groups.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-zero-rows.test
M tests/query_test/test_scanners.py
6 files changed, 65 insertions(+), 1 deletion(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
