hive-issues mailing list archives

From "Mithun Antony (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-23390) Duplicate entry for a table in TAB_COL_STATS
Date Thu, 14 May 2020 01:09:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106766#comment-17106766 ]

Mithun Antony commented on HIVE-23390:
--------------------------------------

The suggestion is to add a unique constraint to TAB_COL_STATS on the columns DB_NAME, TABLE_NAME,
and COLUMN_NAME, since there should be only one record per column.
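
A minimal sketch of how such a constraint would behave, using an in-memory SQLite database as a stand-in for the metastore backend. The table here is a simplified, hypothetical version of TAB_COL_STATS: only DB_NAME, TABLE_NAME and COLUMN_NAME come from the comment, while CS_ID, COLUMN_TYPE, the index name UNIQ_TAB_COL_STATS and the helper insert_stats are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real metastore database (assumption).
conn = sqlite3.connect(":memory:")

# Simplified, hypothetical subset of TAB_COL_STATS; the real table has more columns.
conn.execute(
    """
    CREATE TABLE TAB_COL_STATS (
        CS_ID       INTEGER PRIMARY KEY,
        DB_NAME     TEXT NOT NULL,
        TABLE_NAME  TEXT NOT NULL,
        COLUMN_NAME TEXT NOT NULL,
        COLUMN_TYPE TEXT NOT NULL
    )
    """
)

# The proposed constraint: at most one stats row per (database, table, column).
# The index name is made up for this example.
conn.execute(
    "CREATE UNIQUE INDEX UNIQ_TAB_COL_STATS "
    "ON TAB_COL_STATS (DB_NAME, TABLE_NAME, COLUMN_NAME)"
)


def insert_stats(cs_id, db, table, column, col_type):
    """Return True if the row was stored, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO TAB_COL_STATS VALUES (?, ?, ?, ?, ?)",
                (cs_id, db, table, column, col_type),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Two writers try to store first-time stats for the same column.
first = insert_stats(7, "default", "dual", "dummy", "smallint")
second = insert_stats(11, "default", "dual", "dummy", "smallint")
row_count = conn.execute("SELECT COUNT(*) FROM TAB_COL_STATS").fetchone()[0]
print(first, second, row_count)
```

With the constraint in place, the second insert is rejected by the database instead of silently creating a duplicate, so the losing writer would need to retry as an update.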

> Duplicate entry for a table in TAB_COL_STATS 
> ---------------------------------------------
>
>                 Key: HIVE-23390
>                 URL: https://issues.apache.org/jira/browse/HIVE-23390
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 2.3.4
>            Reporter: Mithun Antony
>            Priority: Major
>
> When the *_analyze <table>_* command was executed from Presto to update the stats of
> a table for the first time, from multiple clusters sharing the same Hive metastore,
> duplicate entries for the same table were inserted into the *_TAB_COL_STATS_* table.
> This led to failures when executing further *_analyze <table>_* commands.
> {code:java}
> Query failed: Multiple entries with same key: dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty, decimalStatistics=Optional.empty,
dateStatistics=Optional.empty, booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0], distinctValuesCount=OptionalLong[1]}
and dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty, decimalStatistics=Optional.empty,
dateStatistics=Optional.empty, booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0], distinctValuesCount=OptionalLong[1]}.
> {code}
> Duplicate records in the *_TAB_COL_STATS_*
> {code:java}
> '7','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'
> '11','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'
> {code}
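
Existing duplicates like the two rows quoted above can be located by grouping on the column identity. The following is a rough sketch against an in-memory SQLite table, not the real metastore schema: the table is reduced to four columns, and the name CS_ID for the leading id value is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced version of TAB_COL_STATS (the real table has more columns).
conn.execute(
    "CREATE TABLE TAB_COL_STATS ("
    "CS_ID INTEGER PRIMARY KEY, DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT)"
)

# The two duplicate rows from the report, trimmed to the identifying fields.
conn.executemany(
    "INSERT INTO TAB_COL_STATS VALUES (?, ?, ?, ?)",
    [(7, "default", "dual", "dummy"), (11, "default", "dual", "dummy")],
)

# Every (database, table, column) that has more than one stats row,
# with the number of copies and the lowest id among them.
duplicates = conn.execute(
    """
    SELECT DB_NAME, TABLE_NAME, COLUMN_NAME, COUNT(*) AS copies, MIN(CS_ID) AS first_id
    FROM TAB_COL_STATS
    GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
    HAVING COUNT(*) > 1
    """
).fetchall()

for db, table, column, copies, first_id in duplicates:
    print(f"{db}.{table}.{column}: {copies} rows, lowest id {first_id}")
```

A query of this shape would have to return no rows before a unique constraint on these columns could be created.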



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
