impala-dev mailing list archives

From "Skye Wanderman-Milne (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
Date Fri, 01 Apr 2016 23:23:37 GMT
Hello Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/2384

to look at the new patch set (#12).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
......................................................................

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

This patch introduces a new query option,
PARQUET_FALLBACK_SCHEMA_RESOLUTION, which allows Parquet files' schemas
to be resolved by either name or position.  It's "fallback" because
eventually field IDs will be the primary schema resolution scheme, and
we don't want to create an option that we will have to rename
later. The default is still by position. I chose a query option
because it will make testing easier and make resolution problems
quicker to diagnose in the field. Users who want to switch the default
behavior to by name (like Hive) can use the --default_query_options
flag.
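
To illustrate the difference between the two resolution modes, here is a
minimal Python sketch. It is not the scanner code from this patch: the
function name resolve_columns, the mode strings "position" and "name", and
the case-insensitive name matching are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def resolve_columns(table_cols: List[str],
                    file_cols: List[str],
                    mode: str) -> Dict[str, Optional[int]]:
    """Map each table column to an index in the file schema.

    Hypothetical helper for illustration only. A value of None means the
    table column has no matching column in the file.
    """
    mode = mode.lower()
    if mode == "position":
        # The i-th table column reads the i-th file column, whatever its name.
        return {col: (i if i < len(file_cols) else None)
                for i, col in enumerate(table_cols)}
    if mode == "name":
        # Assumption: names are compared case-insensitively.
        lookup = {name.lower(): i for i, name in enumerate(file_cols)}
        return {col: lookup.get(col.lower()) for col in table_cols}
    raise ValueError("unknown resolution mode: %s" % mode)


# A file whose columns were written in a different order than the table's.
table = ["id", "name"]
parquet_file = ["name", "id"]
by_position = resolve_columns(table, parquet_file, "position")
by_name = resolve_columns(table, parquet_file, "name")
```

With these inputs, by-position resolution pairs the table's "id" with the
file's first column (which actually holds "name"), while by-name resolution
pairs it with the file's second column.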

This patch also introduces a new test section, SHELL, which can be
used to execute shell commands in a .test file. This is useful for
copying files into test tables.
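
As a rough picture of how a test runner might handle such a section, the
Python sketch below splits a test file into named sections and runs any
SHELL section before the rest of the case. It is not the code in
tests/util/test_file_parser.py: the "====" case delimiter, the "---- NAME"
section header, and both function names are assumptions for the example.

```python
import subprocess
from typing import Dict, List


def parse_sections(text: str) -> List[Dict[str, str]]:
    """Split test-file text into cases, each mapping section name to body.

    Assumed format: "====" separates cases, "---- NAME" starts a section.
    """
    cases: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        if line.startswith("===="):
            if current:
                cases.append(current)
            current, name = {}, None
        elif line.startswith("---- "):
            name = line[5:].strip()
            current[name] = ""
        elif name is not None:
            current[name] += line + "\n"
    if current:
        cases.append(current)
    return cases


def run_shell_sections(cases: List[Dict[str, str]]) -> List[str]:
    """Run each case's SHELL section, if any, and collect its stdout."""
    outputs = []
    for case in cases:
        if "SHELL" in case:
            # check=True raises if the command fails, so setup errors surface.
            result = subprocess.run(case["SHELL"], shell=True, check=True,
                                    capture_output=True, text=True)
            outputs.append(result.stdout)
    return outputs


example = "====\n---- SHELL\necho copied\n---- QUERY\nselect 1\n====\n"
cases = parse_sections(example)
outputs = run_shell_sections(cases)
```

In a real test the SHELL body would hold the file-copy command rather than
an echo.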

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/12
-- 
To view, visit http://gerrit.cloudera.org:8080/2384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 12
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <skye@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Silvius Rus <srus@cloudera.com>
Gerrit-Reviewer: Skye Wanderman-Milne <skye@cloudera.com>
