Date: Thu, 30 Mar 2017 17:58:41 +0000 (UTC)
From: "Sahil Takiar (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

    [ https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949511#comment-15949511 ]

Sahil Takiar commented on HIVE-15396:
-------------------------------------

Good point. How about the approach in my 3rd patch? It checks if the data location is empty or not. If it is empty, all stats are collected, if it isn't then only basic stats are added. I'll remove the check for {{isExternal()}}.

> Basic Stats are not collected when for managed tables with LOCATION specified
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-15396
>                 URL: https://issues.apache.org/jira/browse/HIVE-15396
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, HIVE-15396.3.patch, HIVE-15396.4.patch
>
>
> Basic stats are not collected when a managed table is created with a specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:10000> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:10000> describe formatted hdfs_1;
> +-------------------------------+----------------------------------------------------+-----------------------------+
> | col_name                      | data_type                                          | comment                     |
> +-------------------------------+----------------------------------------------------+-----------------------------+
> | # col_name                    | data_type                                          | comment                     |
> |                               | NULL                                               | NULL                        |
> | col                           | int                                                |                             |
> |                               | NULL                                               | NULL                        |
> | # Detailed Table Information  | NULL                                               | NULL                        |
> | Database:                     | default                                            | NULL                        |
> | Owner:                        | anonymous                                          | NULL                        |
> | CreateTime:                   | Wed Mar 22 18:09:19 PDT 2017                       | NULL                        |
> | LastAccessTime:               | UNKNOWN                                            | NULL                        |
> | Retention:                    | 0                                                  | NULL                        |
> | Location:                     | file:/warehouse/hdfs_1                             | NULL                        |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                        |
> | Table Parameters:             | NULL                                               | NULL                        |
> |                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"}  |
> |                               | numFiles                                           | 0                           |
> |                               | numRows                                            | 0                           |
> |                               | rawDataSize                                        | 0                           |
> |                               | totalSize                                          | 0                           |
> |                               | transient_lastDdlTime                              | 1490231359                  |
> |                               | NULL                                               | NULL                        |
> | # Storage Information         | NULL                                               | NULL                        |
> | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                        |
> | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                        |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                        |
> | Compressed:                   | No                                                 | NULL                        |
> | Num Buckets:                  | -1                                                 | NULL                        |
> | Bucket Columns:               | []                                                 | NULL                        |
> | Sort Columns:                 | []                                                 | NULL                        |
> | Storage Desc Params:          | NULL                                               | NULL                        |
> |                               | serialization.format                               | 1                           |
> +-------------------------------+----------------------------------------------------+-----------------------------+
> 0: jdbc:hive2://localhost:10000> create table s3_1 (col int) location 's3a://[bucket]/test-tables/s3-1';
> 0: jdbc:hive2://localhost:10000> describe formatted s3_1;
> +-------------------------------+----------------------------------------------------+-----------------------+
> | col_name                      | data_type                                          | comment               |
> +-------------------------------+----------------------------------------------------+-----------------------+
> | # col_name                    | data_type                                          | comment               |
> |                               | NULL                                               | NULL                  |
> | col                           | int                                                |                       |
> |                               | NULL                                               | NULL                  |
> | # Detailed Table Information  | NULL                                               | NULL                  |
> | Database:                     | default                                            | NULL                  |
> | Owner:                        | anonymous                                          | NULL                  |
> | CreateTime:                   | Wed Mar 22 18:10:01 PDT 2017                       | NULL                  |
> | LastAccessTime:               | UNKNOWN                                            | NULL                  |
> | Retention:                    | 0                                                  | NULL                  |
> | Location:                     | s3a://[bucket]/test-tables/s3-1                    | NULL                  |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                  |
> | Table Parameters:             | NULL                                               | NULL                  |
> |                               | transient_lastDdlTime                              | 1490231401            |
> |                               | NULL                                               | NULL                  |
> | # Storage Information         | NULL                                               | NULL                  |
> | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                  |
> | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                  |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                  |
> | Compressed:                   | No                                                 | NULL                  |
> | Num Buckets:                  | -1                                                 | NULL                  |
> | Bucket Columns:               | []                                                 | NULL                  |
> | Sort Columns:                 | []                                                 | NULL                  |
> | Storage Desc Params:          | NULL                                               | NULL                  |
> |                               | serialization.format                               | 1                     |
> +-------------------------------+----------------------------------------------------+-----------------------+
> {code}
> There are no stats defined in the describe for the s3 table. Furthermore, when inserting into the s3 table the {{numRows}} stats are not collected for the s3 table.
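
For illustration only, the check proposed in the comment (look at whether the table's data location is empty, then decide which stats to record) might be sketched in Python roughly as follows. This is not code from any of the attached patches. The function name `stats_on_create` and its list-of-(path, size) input are hypothetical, and reading "only basic stats" as the file-level counters {{numFiles}} and {{totalSize}} is an assumption. The parameter names themselves are taken from the describe output quoted above.

```python
from typing import Dict, Iterable, Tuple


def stats_on_create(files: Iterable[Tuple[str, int]]) -> Dict[str, str]:
    """Hypothetical helper: choose the table parameters to set at CREATE TABLE time.

    `files` holds (path, size_in_bytes) pairs found under the table's data
    location; an empty iterable means the location holds no data yet.
    """
    found = list(files)
    if not found:
        # Empty location: every counter is known to be zero, so the full set
        # of stats can be written and marked as accurate.
        return {
            "COLUMN_STATS_ACCURATE": '{"BASIC_STATS":"true"}',
            "numFiles": "0",
            "numRows": "0",
            "rawDataSize": "0",
            "totalSize": "0",
        }
    # Non-empty location (assumption): record only what a file listing gives
    # for free, and leave row-level stats unset instead of guessing them.
    return {
        "numFiles": str(len(found)),
        "totalSize": str(sum(size for _, size in found)),
    }
```

Under these assumptions, a table created over an empty s3a location would get the same zeroed parameters as {{hdfs_1}} above, while a table created over existing files would get only {{numFiles}} and {{totalSize}}.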