hive-user mailing list archives

From Vivek Sharma <vlsha...@hotmail.com>
Subject Re: Num Rows computed by EXPLAIN - Hive
Date Thu, 09 Feb 2017 08:07:04 GMT
Thanks, Pengcheng. It worked. Now I am curious to know where this data (the table statistics)
is stored. For example, in Oracle these are stored in DBA_TABLES and so on.


Regards

Vivek


________________________________
From: Pengcheng Xiong <pxiong@apache.org>
Sent: Thursday, February 9, 2017 7:12 AM
To: user@hive.apache.org
Subject: Re: Num Rows computed by EXPLAIN - Hive

Did you run "analyze table emp compute statistics" before you ran the explain? Thanks.

Pengcheng
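
For reference, the sequence suggested above can be sketched as follows, using the emp table and the query from the original message. The FOR COLUMNS and DESCRIBE FORMATTED steps are additions that do not appear in the thread: the first gathers column-level statistics, and the second lists the table parameters (for example numRows and rawDataSize), which is one way to inspect the basic statistics Hive has recorded for a table. Treat this as a minimal sketch rather than a prescribed procedure.

```sql
-- Gather basic table-level statistics (row count, data size) for emp.
ANALYZE TABLE emp COMPUTE STATISTICS;

-- Optional, not used in the thread: also gather column-level statistics.
ANALYZE TABLE emp COMPUTE STATISTICS FOR COLUMNS;

-- Re-run the plan so the Statistics lines reflect the gathered numbers.
EXPLAIN SELECT country, empno FROM emp;

-- Inspect the table parameters, which include the recorded basic statistics.
DESCRIBE FORMATTED emp;
```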

On Wed, Feb 8, 2017 at 9:29 PM, Vivek Sharma <vlsharma@hotmail.com> wrote:

Hi,


I am new to Hive (just a few days of learning).


I am an Oracle performance expert and am comparing the EXPLAIN feature of Hive with Oracle's
EXPLAIN PLAN command. I have an internal table with around 100 rows in it. However, the EXPLAIN
command in Hive reports the row count as 33, which looks like a huge discrepancy and could cause
performance issues (for a larger table). I would like to know how Hive internally calculates
the "Num rows" value. The output is pasted below (the table actually has 100 rows):


hive (vivek)> explain select country, empno from emp;
OK
Explain
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: emp
          Statistics: Num rows: 33 Data size: 3501 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: country (type: string), empno (type: int)
            outputColumnNames: _col0, _col1
            Statistics: Num rows: 33 Data size: 3501 Basic stats: COMPLETE Column stats: NONE
            ListSink

Regards
Vivek


