hive-issues mailing list archives

From "Sahil Takiar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified
Date Thu, 30 Mar 2017 17:58:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949511#comment-15949511 ]

Sahil Takiar commented on HIVE-15396:
-------------------------------------

Good point. How about the approach in my 3rd patch? It checks whether the data location is
empty. If it is empty, all stats are collected; if it isn't, only basic stats are added.
I'll remove the check for {{isExternal()}}.
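
To illustrate the check described above, here is a minimal Java sketch. It is not the code from the patch: the class name, the {{StatsMode}} enum, and both method names are hypothetical, and it uses {{java.nio.file}} on a local directory as a stand-in for whatever file system call Hive makes against the table location. It only shows the shape of the decision, which is that an empty location gets all stats and a non-empty one gets basic stats only.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StatsDecisionSketch {

    /** Hypothetical labels for the two outcomes described in the comment. */
    enum StatsMode { ALL_STATS, BASIC_STATS_ONLY }

    /** Assumption: a missing path, or a directory with no entries, counts as empty. */
    static boolean isLocationEmpty(Path location) throws IOException {
        if (!Files.isDirectory(location)) {
            return true;
        }
        // Stop at the first entry; there is no need to list the whole directory.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(location)) {
            return !entries.iterator().hasNext();
        }
    }

    /** Empty location: all stats. Existing data: basic stats only. */
    static StatsMode chooseStatsMode(Path location) throws IOException {
        return isLocationEmpty(location)
                ? StatsMode.ALL_STATS
                : StatsMode.BASIC_STATS_ONLY;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("stats-sketch");
        System.out.println(chooseStatsMode(dir));   // location is still empty

        Files.createFile(dir.resolve("part-00000"));
        System.out.println(chooseStatsMode(dir));   // location now holds a file
    }
}
```

Note that the decision depends only on the contents of the location, not on whether the table is managed or external, which is why the {{isExternal()}} check can go.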

> Basic Stats are not collected when for managed tables with LOCATION specified
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-15396
>                 URL: https://issues.apache.org/jira/browse/HIVE-15396
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, HIVE-15396.3.patch, HIVE-15396.4.patch
>
>
> Basic stats are not collected when a managed table is created with a specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:10000> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:10000> describe formatted hdfs_1;
> +-------------------------------+----------------------------------------------------+-----------------------------+
> |           col_name            |                     data_type                      |           comment           |
> +-------------------------------+----------------------------------------------------+-----------------------------+
> | # col_name                    | data_type                                          | comment                     |
> |                               | NULL                                               | NULL                        |
> | col                           | int                                                |                             |
> |                               | NULL                                               | NULL                        |
> | # Detailed Table Information  | NULL                                               | NULL                        |
> | Database:                     | default                                            | NULL                        |
> | Owner:                        | anonymous                                          | NULL                        |
> | CreateTime:                   | Wed Mar 22 18:09:19 PDT 2017                       | NULL                        |
> | LastAccessTime:               | UNKNOWN                                            | NULL                        |
> | Retention:                    | 0                                                  | NULL                        |
> | Location:                     | file:/warehouse/hdfs_1                             | NULL                        |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                        |
> | Table Parameters:             | NULL                                               | NULL                        |
> |                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"}  |
> |                               | numFiles                                           | 0                           |
> |                               | numRows                                            | 0                           |
> |                               | rawDataSize                                        | 0                           |
> |                               | totalSize                                          | 0                           |
> |                               | transient_lastDdlTime                              | 1490231359                  |
> |                               | NULL                                               | NULL                        |
> | # Storage Information         | NULL                                               | NULL                        |
> | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                        |
> | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                        |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                        |
> | Compressed:                   | No                                                 | NULL                        |
> | Num Buckets:                  | -1                                                 | NULL                        |
> | Bucket Columns:               | []                                                 | NULL                        |
> | Sort Columns:                 | []                                                 | NULL                        |
> | Storage Desc Params:          | NULL                                               | NULL                        |
> |                               | serialization.format                               | 1                           |
> +-------------------------------+----------------------------------------------------+-----------------------------+
> 0: jdbc:hive2://localhost:10000> create table s3_1 (col int) location 's3a://[bucket]/test-tables/s3-1';
> 0: jdbc:hive2://localhost:10000> describe formatted s3_1;
> +-------------------------------+----------------------------------------------------+-----------------------+
> |           col_name            |                     data_type                      |        comment        |
> +-------------------------------+----------------------------------------------------+-----------------------+
> | # col_name                    | data_type                                          | comment               |
> |                               | NULL                                               | NULL                  |
> | col                           | int                                                |                       |
> |                               | NULL                                               | NULL                  |
> | # Detailed Table Information  | NULL                                               | NULL                  |
> | Database:                     | default                                            | NULL                  |
> | Owner:                        | anonymous                                          | NULL                  |
> | CreateTime:                   | Wed Mar 22 18:10:01 PDT 2017                       | NULL                  |
> | LastAccessTime:               | UNKNOWN                                            | NULL                  |
> | Retention:                    | 0                                                  | NULL                  |
> | Location:                     | s3a://[bucket]/test-tables/s3-1                    | NULL                  |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                  |
> | Table Parameters:             | NULL                                               | NULL                  |
> |                               | transient_lastDdlTime                              | 1490231401            |
> |                               | NULL                                               | NULL                  |
> | # Storage Information         | NULL                                               | NULL                  |
> | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                  |
> | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                  |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                  |
> | Compressed:                   | No                                                 | NULL                  |
> | Num Buckets:                  | -1                                                 | NULL                  |
> | Bucket Columns:               | []                                                 | NULL                  |
> | Sort Columns:                 | []                                                 | NULL                  |
> | Storage Desc Params:          | NULL                                               | NULL                  |
> |                               | serialization.format                               | 1                     |
> +-------------------------------+----------------------------------------------------+-----------------------+
> {code}
> The {{describe formatted}} output for the S3 table shows no stats at all. Furthermore, when data is inserted into the S3 table, the {{numRows}} stat is not collected.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
