hive-issues mailing list archives

From "Pengcheng Xiong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12411) Remove counter based stats collection mechanism
Date Tue, 24 Nov 2015 09:28:10 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024085#comment-15024085 ]

Pengcheng Xiong commented on HIVE-12411:
----------------------------------------

The test case failures are unrelated, except for metadataonly1.q and optimize_nullscan.q. I analyzed
the golden file differences for both of them and updated them accordingly. Pushed to master.
Thanks [~ashutoshc] for the review!

> Remove counter based stats collection mechanism
> -----------------------------------------------
>
>                 Key: HIVE-12411
>                 URL: https://issues.apache.org/jira/browse/HIVE-12411
>             Project: Hive
>          Issue Type: Task
>          Components: Statistics
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the JDBC and HBase stats collection mechanisms.
> Now we are targeting the counter-based stats collection mechanism. The main reasons for removing
> it are as follows: (1) Counter-based stats are limited by the length of the counter name itself; if it
> is too long, an MD5 hash is applied to it. (2) When there are a large number of partitions and columns,
> we need to create a large number of counters in memory, which puts a heavy load on the
> M/R AM, Tez AM, etc. FS-based stats will do a better job.
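
Point (1) in the description can be illustrated with a minimal sketch. This is not Hive's actual code: the class name, the counterKey helper, and the 40-character limit are all hypothetical, chosen only to show how an over-long stats key gets replaced by a fixed-length MD5 digest, after which the counter name no longer reveals which partition it belongs to.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: names and the length limit are assumptions, not Hive's implementation.
class CounterKeySketch {
    // Hypothetical maximum counter name length.
    static final int MAX_KEY_LENGTH = 40;

    // Returns the key unchanged if it fits, otherwise its 32-character MD5 hex digest.
    static String counterKey(String statsPrefix) {
        if (statsPrefix.length() <= MAX_KEY_LENGTH) {
            return statsPrefix;
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(statsPrefix.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Short key: kept as-is and still human-readable.
        System.out.println(counterKey("db.tbl/ds=1"));
        // Long, multi-level partition key: replaced by an opaque hash.
        System.out.println(counterKey("default.sales/year=2015/month=11/day=24/region=emea"));
    }
}
```

Point (2) follows from the same picture: one such counter is needed per partition and statistic, so the number of counters held by the application master grows with the number of partitions.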



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
