hive-dev mailing list archives

From "Chaoyu Tang" <ctang...@gmail.com>
Subject Re: Review Request 31978: HIVE-9720: Metastore does not properly migrate column stats when renaming a table across databases
Date Thu, 12 Mar 2015 14:35:45 GMT


> On March 12, 2015, 1:32 p.m., Xuefu Zhang wrote:
> >

Thanks, Xuefu! It looks like InvalidObjectException provides only three constructors, none of
which is InvalidObjectException(String message, Throwable cause), and its String message is the
only Thrift field that is passed between client and server.

  public InvalidObjectException() {
  }

  public InvalidObjectException(String message) {
    this();
    this.message = message;
  }

  public InvalidObjectException(InvalidObjectException other) {
    if (other.isSetMessage()) {
      this.message = other.message;
    }
  }
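
Since there is no (message, cause) constructor, one possible workaround is to fold the cause's
description into the message string, which is the only field that crosses the client/server
boundary. The following is a minimal sketch only: InvalidObjectExceptionStub is a hypothetical
stand-in that mirrors the message-only shape of the constructors above (it is not the
Thrift-generated class), and withCause is a hypothetical helper, not something in the patch.

```java
public class CauseInMessageSketch {
    // Hypothetical stand-in that mirrors the message-only shape of the
    // Thrift-generated exception quoted above; not the real Hive class.
    static class InvalidObjectExceptionStub extends Exception {
        private String message;

        InvalidObjectExceptionStub() {
        }

        InvalidObjectExceptionStub(String message) {
            this();
            this.message = message;
        }

        @Override
        public String getMessage() {
            return message;
        }
    }

    // Hypothetical helper: folds the cause's class name and message into the
    // message string, because no (message, cause) constructor is available.
    static InvalidObjectExceptionStub withCause(String message, Throwable cause) {
        if (cause == null) {
            return new InvalidObjectExceptionStub(message);
        }
        return new InvalidObjectExceptionStub(
            message + " (caused by " + cause.getClass().getName()
                + ": " + cause.getMessage() + ")");
    }

    public static void main(String[] args) {
        Exception cause = new IllegalStateException("stats row not found");
        InvalidObjectExceptionStub e = withCause("Unable to update column stats", cause);
        System.out.println(e.getMessage());
    }
}
```

The stack trace of the cause is still lost with this approach; only its class name and message
are kept in the text.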


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31978/#review76227
-----------------------------------------------------------


On March 12, 2015, 10:55 a.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31978/
> -----------------------------------------------------------
> 
> (Updated March 12, 2015, 10:55 a.m.)
> 
> 
> Review request for hive, Brock Noland, Chao Sun, Szehon Ho, and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Alter table .. rename/change column did not change or invalidate the column stats data in
> TAB_COL_STATS or PART_COL_STATS, which led to inconsistent data in these tables and caused
> the issues reported in HIVE-9720 and HIVE-9866. For example, if we do alter table .. rename
> and move the table to a different database, all related metadata is changed except the rows
> in TAB_COL_STATS/PART_COL_STATS. When we drop the moved table, Hive needs to delete its column
> stats data first if it was computed, but it cannot, since the DB_NAME stored in TAB_COL_STATS
> does not match the actual DB_NAME, which causes the referential violation seen in HIVE-9720.
> As another example, after we change a table column type, say from int to string using alter
> table ... change ..., if the column stats were computed both before and after the change, the
> column ends up with stats data for both int and string, which is not correct.
> This patch fixes these issues by removing invalid column stats data from TAB_COL_STATS/PART_COL_STATS
> after a change in the db, table, partition, or column type for a column.
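
The rename case described above can be illustrated with a small in-memory model. This is a
sketch under stated assumptions, not the metastore code: the class, the "db.table.column" key,
and the methods putStats, deleteStats and rowCount are all hypothetical, and the model only shows
the missed delete, not the referential violation that follows from it in the real schema.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleColumnStatsSketch {
    // Hypothetical in-memory stand-in for TAB_COL_STATS, keyed by "db.table.column".
    private final Map<String, String> tabColStats = new HashMap<>();

    void putStats(String db, String table, String column, String stats) {
        tabColStats.put(db + "." + table + "." + column, stats);
    }

    // Deletes the stats rows stored under the given db/table name and
    // returns how many rows were removed.
    int deleteStats(String db, String table) {
        String prefix = db + "." + table + ".";
        int before = tabColStats.size();
        tabColStats.keySet().removeIf(k -> k.startsWith(prefix));
        return before - tabColStats.size();
    }

    int rowCount() {
        return tabColStats.size();
    }

    public static void main(String[] args) {
        StaleColumnStatsSketch store = new StaleColumnStatsSketch();
        store.putStats("db1", "t1", "c1", "int stats");

        // The table is renamed to db2.t1, but the stats row still says db1.
        // Dropping the moved table looks up stats under the new name and finds nothing.
        int removedUnderNewName = store.deleteStats("db2", "t1");
        System.out.println("removed under new name: " + removedUnderNewName);
        System.out.println("stale rows left: " + store.rowCount());

        // Invalidating under the old name at rename time removes the stale row.
        int removedUnderOldName = store.deleteStats("db1", "t1");
        System.out.println("removed under old name: " + removedUnderOldName);
    }
}
```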
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java d99cfdf74d57071183e9385b5a3f2c5335e4ce60
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 612f927d5515fbd5c3257a04b85f9bcc4c6891f3
>   ql/src/test/queries/clientpositive/alter_table_invalidate_column_stats.q PRE-CREATION
>   ql/src/test/results/clientpositive/alter_table_invalidate_column_stats.q.out PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/31978/diff/
> 
> 
> Testing
> -------
> 
> 1. Manual tests:
> Went through the cases in alter_table_invalidate_column_stats.q and checked TAB_COL_STATS/PART_COL_STATS
> to make sure that the invalid column stats had been cleaned up after alter table ..., alter table
> ... cascade, and alter table partition ..., with both sqldirect and ORM.
> 2. A new qtest, alter_table_invalidate_column_stats.q, was added, and the patch has been submitted
> to kick off the precommit build.
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>

