impala-issues mailing list archives

From "Thomas Tauber-Marshall (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-5283) Handle case sensitivity naming conflicts in Kudu tables
Date Mon, 19 Jun 2017 18:24:00 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Tauber-Marshall resolved IMPALA-5283.
--------------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

commit 7f3817982ff7968193e77abff051a52fc6e0d8cf
Author: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Date:   Fri May 12 12:23:42 2017 -0700

    IMPALA-5286/IMPALA-5283: Kudu column name case cleanup
    
    Impala is case insensitive for column names and generally deals
    with them in all lower case. Kudu is case sensitive. This can
    lead to problems when a table is created externally in Kudu
    with a column name with upper case letters.
    
    This patch solves the problem by having KuduColumn always store
    its name in lower case, so that general Impala code that has been
    written expecting lower cased column names can use Column.getName()
    safely.
    
    It also adds the method KuduColumn.getKuduName(), which returns
    the column name in the case that it appears in Kudu. Any code that
    passes column names into the Kudu API must call this method first
    to get the correct column name.
    
    There are four specific situations fixed by this patch:
    - When ordering on a Kudu column, the Analyzer would create
      two SlotDescriptors that point to the same column because
      registerSlotRef() was being called with inconsistent casing.
      It is now always called with the lower cased names.
    - 'ADD RANGE PARTITION' would fail to find the range partition
      column if it wasn't all lower case in Kudu.
    - 'ALTER TABLE DROP COLUMN' and 'ALTER TABLE CHANGE' only worked
      if the column name was specified in Kudu case.
    - 'CREATE EXTERNAL TABLE' called on a Kudu table with column names
      that differ only in case now returns an error, since Impala has
      no way of handling this situation.
    
    Testing:
    - Added e2e tests in test_kudu.py.
    - Manually edited functional_kudu to change column names to have
      mixed casing and ran the kudu tests.
    
    Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
    Reviewed-on: http://gerrit.cloudera.org:8080/6902
    Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
    Tested-by: Impala Public Jenkins
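
The naming scheme described in the commit message can be illustrated with a minimal sketch. This is not the actual Impala class: `KuduColumnSketch` is a hypothetical, simplified stand-in, and only the method names `getName()` and `getKuduName()` come from the commit message. The use of `Locale.ROOT` for lower-casing is an assumption made here for predictable behaviour.

```java
import java.util.Locale;

// Hypothetical, simplified stand-in for Impala's KuduColumn (not the real class).
final class KuduColumnSketch {
    // Lower-cased name, which general Impala code expects.
    private final String name;
    // Name exactly as it appears in Kudu, for use with the Kudu API.
    private final String kuduName;

    KuduColumnSketch(String kuduName) {
        this.kuduName = kuduName;
        // Assumption: Locale.ROOT avoids locale-specific case mappings.
        this.name = kuduName.toLowerCase(Locale.ROOT);
    }

    // Safe for Impala code written against lower-cased column names.
    String getName() {
        return name;
    }

    // Must be used whenever a column name is passed into the Kudu API.
    String getKuduName() {
        return kuduName;
    }
}
```

With this shape, a column created in Kudu as "UserId" would be seen by Impala code as "userid", while calls into Kudu would still use "UserId".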

> Handle case sensitivity naming conflicts in Kudu tables
> -------------------------------------------------------
>
>                 Key: IMPALA-5283
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5283
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0
>            Reporter: Matthew Jacobs
>            Assignee: Thomas Tauber-Marshall
>              Labels: kudu
>             Fix For: Impala 2.10.0
>
>
> Kudu supports case sensitive table/column names while Impala (and Hive) do not (names get lower-cased).
> Conflicting column names will cause problems for Impala, e.g.
> {code}
> table foo (
> int Col,
> int col ) ... stored as Kudu
> {code}
> We should return an error when loading a table such as this.
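
A check of the kind requested above could look roughly like the following sketch. It is an illustration only, not the code from the Impala patch: the class `CaseConflictCheck`, the method `findCaseConflict`, and the error message wording are all hypothetical. It lower-cases each Kudu column name and reports the first pair that collides.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; the names and the message text are assumptions.
final class CaseConflictCheck {
    // Returns an error message for the first two column names that
    // collide after lower-casing, or an empty Optional if none do.
    static Optional<String> findCaseConflict(List<String> kuduColumnNames) {
        // Maps each lower-cased name to the first Kudu name that produced it.
        Map<String, String> seen = new HashMap<>();
        for (String kuduName : kuduColumnNames) {
            String lower = kuduName.toLowerCase(Locale.ROOT);
            String previous = seen.putIfAbsent(lower, kuduName);
            if (previous != null) {
                return Optional.of("Column names '" + previous + "' and '"
                        + kuduName + "' differ only in case");
            }
        }
        return Optional.empty();
    }
}
```

For the example table above, passing the names "Col" and "col" would produce an error message, whereas a list such as "id", "Name" would pass the check.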



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
