Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 62881200B34 for ; Sat, 18 Jun 2016 00:50:07 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 61210160A66; Fri, 17 Jun 2016 22:50:07 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A9994160A61 for ; Sat, 18 Jun 2016 00:50:06 +0200 (CEST) Received: (qmail 28180 invoked by uid 500); 17 Jun 2016 22:50:05 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 28159 invoked by uid 99); 17 Jun 2016 22:50:05 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 17 Jun 2016 22:50:05 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 5DF482C1F5C for ; Fri, 17 Jun 2016 22:50:05 +0000 (UTC) Date: Fri, 17 Jun 2016 22:50:05 +0000 (UTC) From: "Carter Shanklin (JIRA)" To: dev@hive.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Created] (HIVE-14053) Hive should report that primary keys can't be null. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Fri, 17 Jun 2016 22:50:07 -0000 Carter Shanklin created HIVE-14053: -------------------------------------- Summary: Hive should report that primary keys can't be null. Key: HIVE-14053 URL: https://issues.apache.org/jira/browse/HIVE-14053 Project: Hive Issue Type: Bug Reporter: Carter Shanklin HIVE-13076 introduces "rely novalidate" primary and foreign keys to Hive. With the right driver in place, tools like Tableau can do join elimination and queries can run much faster. Some gaps remain, currently getAttributes() in HiveDatabaseMetaData doesn't work quite right for keys. In particular, primary keys by definition are not null and the metadata should reflect this for improved join elimination. In this example that uses the TPC-H schema and its constraints, we sum l_extendedprice and group by l_shipmode. This query should not use more than just the lineitem table. With all the constraints in place, Tableau generates this query: {code} SELECT `lineitem`.`l_shipmode` AS `l_shipmode`, SUM(`lineitem`.`l_extendedprice`) AS `sum_l_extendedprice_ok` FROM `tpch_bin_flat_orc_2`.`lineitem` `lineitem` JOIN `tpch_bin_flat_orc_2`.`orders` `orders` ON (`lineitem`.`l_orderkey` = `orders`.`o_orderkey`) JOIN `tpch_bin_flat_orc_2`.`customer` `customer` ON (`orders`.`o_custkey` = `customer`.`c_custkey`) JOIN `tpch_bin_flat_orc_2`.`nation` `nation` ON (`customer`.`c_nationkey` = `nation`.`n_nationkey`) WHERE ((((NOT (`lineitem`.`l_partkey` IS NULL)) AND (NOT (`lineitem`.`l_suppkey` IS NULL))) AND ((NOT (`lineitem`.`l_partkey` IS NULL)) AND (NOT (`lineitem`.`l_suppkey` IS NULL)))) AND (NOT (`nation`.`n_regionkey` IS NULL))) {code} Since these are the primary keys the denormalization and the where condition is unnecessary and this sort of query can be a lot faster by just accessing the lineitem table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)