hive-issues mailing list archives

From "Dinesh Garg (Jira)" <>
Subject [jira] [Updated] (HIVE-22947) The method getTableObjectsByName() in is slow
Date Tue, 03 Mar 2020 23:47:00 GMT


Dinesh Garg updated HIVE-22947:
    Priority: Critical  (was: Major)

> The method getTableObjectsByName() in is slow
> ----------------------------------------------------------------------
>                 Key: HIVE-22947
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Fang-Yu Rao
>            Priority: Critical
>         Attachments: Benchmark_related_to_IMPALA-9363.pdf
> The {{getTableObjectsByName()}} RPC in {{}} ([])
is very slow. In an empirical evaluation, loading the complete metadata of all the tables
in a database of 40,000 tables took at least 170 seconds with {{getTableObjectsByName()}},
whereas {{getAllTables()}} ([])
took less than 0.5 seconds on the same machine.
> In some use cases, not all the fields of {{org.apache.hadoop.hive.metastore.api.Table}}
are required. For instance, if a client only needs to determine the type of a table,
e.g., an HDFS table or a Kudu table, then it should suffice to load only the {{sd}} field,
which is of class {{org.apache.hadoop.hive.metastore.api.StorageDescriptor}}. It would be
great if {{getTableObjectsByName()}} could be made more fine-grained so that only the
fields requested by the client are retrieved, which could also reduce the time spent
on this RPC.
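> The fine-grained retrieval proposed here could look roughly like the following minimal Java
sketch. It is an illustration under stated assumptions, not the metastore's API:
{{FieldProjectionSketch}}, {{TableField}}, {{TableStub}}, {{loadField()}} and the two-argument
{{getTableObjectsByName()}} are all hypothetical names, and a counter of simulated field loads
stands in for the real cost of the RPC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: none of these types belong to the Hive metastore API.
final class FieldProjectionSketch {

    // Hypothetical selectors for the parts of a table a client may request.
    enum TableField { SD, PARAMETERS, PARTITION_KEYS }

    // Simplified stand-in for org.apache.hadoop.hive.metastore.api.Table.
    static final class TableStub {
        final String name;
        String sd;             // stand-in for the StorageDescriptor field
        String parameters;
        String partitionKeys;

        TableStub(String name) { this.name = name; }
    }

    // Number of simulated per-field backend loads, used as a proxy for RPC cost.
    static int loads = 0;

    // Simulates fetching one field of one table from the backing store.
    static String loadField(String table, TableField field) {
        loads++;
        return field + ":" + table;
    }

    // Hypothetical fine-grained variant: only the requested fields are populated,
    // every other field of the returned stub stays null.
    static List<TableStub> getTableObjectsByName(List<String> names, Set<TableField> wanted) {
        List<TableStub> result = new ArrayList<>();
        for (String name : names) {
            TableStub t = new TableStub(name);
            if (wanted.contains(TableField.SD)) {
                t.sd = loadField(name, TableField.SD);
            }
            if (wanted.contains(TableField.PARAMETERS)) {
                t.parameters = loadField(name, TableField.PARAMETERS);
            }
            if (wanted.contains(TableField.PARTITION_KEYS)) {
                t.partitionKeys = loadField(name, TableField.PARTITION_KEYS);
            }
            result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("t1", "t2", "t3");

        // Request only the storage descriptor, as in the table-type use case.
        loads = 0;
        getTableObjectsByName(names, EnumSet.of(TableField.SD));
        int sdOnly = loads;

        // Request every field, which mirrors loading the complete metadata.
        loads = 0;
        getTableObjectsByName(names, EnumSet.allOf(TableField.class));
        int allFields = loads;

        System.out.println("loads (sd only): " + sdOnly + ", loads (all fields): " + allFields);
    }
}
```

In this sketch the work grows with the number of requested fields per table, which is the effect
the proposal hopes for. Whether the real RPC would speed up in the same proportion is not
something the sketch can show.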
> A spreadsheet with the detailed experimental results is also attached ([^Benchmark_related_to_IMPALA-9363.pdf]).
In the experiment, Impala's {{catalogd}}, acting as a client of the Hive metastore, calls
{{getTableObjectsByName()}} to retrieve the complete metadata of the tables in
a database of 40,000 tables.

This message was sent by Atlassian Jira
