Subject: Re: Review Request 38429: HIVE-11786: Deprecate the use of redundant column in colunm stats related tables
From: "Chaoyu Tang"
To: "Xuefu Zhang", "Ashutosh Chauhan", "Sergey Shelukhin"
Cc: "hive", "Chaoyu Tang"
Date: Thu, 17 Sep 2015 17:36:46 -0000
Message-ID: <20150917173646.9550.32507@reviews.apache.org>
References: <20150916212614.3773.95751@reviews.apache.org>
In-Reply-To: <20150916212614.3773.95751@reviews.apache.org>
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
X-ReviewRequest-URL: https://reviews.apache.org/r/38429/
X-ReviewRequest-Repository: hive-git
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============4157702314942609606=="

--===============4157702314942609606==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38429/#review99307
-----------------------------------------------------------


On Sept. 17, 2015, 5:35 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38429/
> -----------------------------------------------------------
>
> (Updated Sept. 17, 2015, 5:35 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Sergey Shelukhin, and Xuefu Zhang.
>
>
> Bugs: HIVE-11786
>     https://issues.apache.org/jira/browse/HIVE-11786
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> The column stats tables, such as TAB_COL_STATS and PART_COL_STATS, have redundant columns (DB_NAME, TABLE_NAME, PARTITION_NAME), since these tables already carry a foreign key (TBL_ID or PART_ID) referencing TBLS or PARTITIONS. These columns are currently used when fetching column stats (e.g. getTableStats/getPartitionStats), so any Hive operation that changes a db/table/partition name has to update them. That is unnecessary and sometimes quite difficult to implement given the limitations of the DN (DataNucleus) and RawStore APIs.
> This patch removes the use of these redundant columns at the HMS code level. The changes include:
> 1. Instead of using these columns directly in TAB_COL_STATS and PART_COL_STATS, use the corresponding columns in the tables they reference.
> 2. The CBO code currently assumes that the column stats returned from HMS are in the same order as the columns passed in the request. That order is not guaranteed, so this code has been changed.
> 3. The deprecated redundant columns are now temporarily populated with the value "Deprecated". They will be removed in a follow-up JIRA.
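
Editorial sketch of changes 1 and 2 above. This is not code from the patch: the class ColumnStatsSketch, the record ColStat, and both method names are hypothetical. The query text assumes the usual metastore column names (TBLS.TBL_NAME, TBLS.DB_ID, DBS.NAME) and leaves out the identifier quoting a real direct-SQL query would likely need. The first method resolves a table through the TBL_ID foreign key instead of the redundant name columns; the second realigns returned stats with the requested columns by name, so the caller does not depend on the order in which HMS returns them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class ColumnStatsSketch {

    /** Hypothetical stand-in for one per-column statistics record. */
    static final class ColStat {
        final String colName;
        final long numNulls;

        ColStat(String colName, long numNulls) {
            this.colName = colName;
            this.numNulls = numNulls;
        }
    }

    /**
     * Builds an illustrative query that finds table column stats through
     * TBL_ID -> TBLS -> DBS rather than through TAB_COL_STATS.DB_NAME and
     * TAB_COL_STATS.TABLE_NAME. Schema names are an assumption.
     */
    static String tableStatsQuery(int columnCount) {
        // One "?" placeholder per requested column.
        String placeholders = String.join(", ", Collections.nCopies(columnCount, "?"));
        return "SELECT s.* FROM TAB_COL_STATS s"
            + " JOIN TBLS t ON s.TBL_ID = t.TBL_ID"
            + " JOIN DBS d ON t.DB_ID = d.DB_ID"
            + " WHERE d.NAME = ? AND t.TBL_NAME = ?"
            + " AND s.COLUMN_NAME IN (" + placeholders + ")";
    }

    /**
     * Returns one entry per requested column, in request order. A column
     * with no returned stats yields null at its position.
     */
    static List<ColStat> alignByName(List<String> requested, List<ColStat> returned) {
        // Index the returned stats by lower-cased column name.
        Map<String, ColStat> byName = new HashMap<>();
        for (ColStat stat : returned) {
            byName.put(stat.colName.toLowerCase(Locale.ROOT), stat);
        }
        List<ColStat> aligned = new ArrayList<>(requested.size());
        for (String col : requested) {
            aligned.add(byName.get(col.toLowerCase(Locale.ROOT)));
        }
        return aligned;
    }

    public static void main(String[] args) {
        System.out.println(tableStatsQuery(2));
        List<ColStat> aligned = alignByName(
            Arrays.asList("a", "b", "c"),
            Arrays.asList(new ColStat("c", 3L), new ColStat("a", 1L)));
        for (ColStat stat : aligned) {
            System.out.println(stat == null ? "(no stats)" : stat.colName + " " + stat.numNulls);
        }
    }
}
```

Keying by column name makes a missing column visible as a null entry instead of silently shifting every later column by one position, which is the failure mode a positional pairing would have.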
>
>
> Diffs
> -----
>
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 1f89b7c
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 4d6bfcc
>   metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java b3ceff1
>   metastore/src/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java 328a65c
>   metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java 2967a60
>   metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java 132f7a1
>   metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java 7e46523
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 6c0bd25
>
> Diff: https://reviews.apache.org/r/38429/diff/
>
>
> Testing
> -------
>
> 1. Manually tested some cases against MySQL/PostgreSQL/Oracle.
> 2. The precommit tests are currently running.
>
>
> Thanks,
>
> Chaoyu Tang
>
>

--===============4157702314942609606==--