impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5286/IMPALA-5283: Kudu column name case cleanup
Date Fri, 16 Jun 2017 00:58:41 GMT
Thomas Tauber-Marshall has uploaded a new patch set (#4).

Change subject: IMPALA-5286/IMPALA-5283: Kudu column name case cleanup
......................................................................

IMPALA-5286/IMPALA-5283: Kudu column name case cleanup

Impala is case insensitive for column names and generally deals
with them in all lower case. Kudu is case sensitive. This can
lead to a problems when a table is created externally in Kudu
with a column name with upper case letters.

This patch solves the problem by having KuduColumn always store
its name in lower case, so that general Impala code that has been
written expecting lower cased column names can use Column.getName()
safely.

It also adds the method KuduColumn.getKuduName(), which returns
the column name in the case that it appears in Kudu. Any code that
passes column names into the Kudu API must call this method first
to get the correct column name.

There are four specific situations fixed by this patch:
- When ordering on a Kudu column, the Analyzer would create
  two SlotDescriptors that point to the same column because
  registerSlotRef() was being called with inconsistent casing.
  It is now always called with the lower cased names.
- 'ADD RANGE PARTITION' would fail to find the range partition
  column if it isn't all lower case in Kudu.
- 'ALTER TABLE DROP COLUMN' and 'ALTER TABLE CHANGE' only worked
  if the column name was specified in Kudu case.
- 'CREATE EXTERNAL TABLE' called on a Kudu table with column names
  that differ only in case now returns an error, since Impala has
  no way of handling this situation.

Testing:
- Added e2e tests in test_kudu.py.
- Manually edited functional_kudu to change column names to have
  mixed casing and ran the kudu tests.

Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M tests/query_test/test_kudu.py
9 files changed, 122 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/6902/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6902
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>

Mime
View raw message