hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Carter Shanklin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-16938) INFORMATION_SCHEMA usability: difficult to access # of table records
Date Thu, 22 Jun 2017 03:45:00 GMT
Carter Shanklin created HIVE-16938:
--------------------------------------

             Summary: INFORMATION_SCHEMA usability: difficult to access # of table records
                 Key: HIVE-16938
                 URL: https://issues.apache.org/jira/browse/HIVE-16938
             Project: Hive
          Issue Type: Bug
            Reporter: Carter Shanklin


HIVE-1010 adds an information schema to Hive, also taking the opportunity to expose some non-standard
but valuable things like statistics in a SYS table.

One common thing users want to know is the number of rows in tables, system wide.

This information is in the table_params table but the structure of this table makes it quite
inconvenient to access since it is essentially a table of key-value pairs. More table stats
are likely to be added over time, especially because of ACID. It would be a lot better if
this were a first class table.

For what it's worth I deal with the current table by pivoting it into something easier to
deal with as follows:

{code}
create view table_stats as
select
  tbl_id,
  max(case param_key when 'COLUMN_STATS_ACCURATE' then param_value end) as COLUMN_STATS_ACCURATE,
  max(case param_key when 'numFiles' then param_value end) as numFiles,
  max(case param_key when 'numRows' then param_value end) as numRows,
  max(case param_key when 'rawDataSize' then param_value end) as rawDataSize,
  max(case param_key when 'totalSize' then param_value end) as totalSize,
  max(case param_key when 'transient_lastDdlTime' then param_value end) as transient_lastDdlTime
from table_params group by tbl_id;
{code}

It would be better to not have users provide workarounds and make table stats first-class
like column stats currently are.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message