hive-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-13788) hive msck listpartitions need to make use of directSQL instead of datanucleus
Date Fri, 03 Jun 2016 12:31:59 GMT

     [ https://issues.apache.org/jira/browse/HIVE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated HIVE-13788:
------------------------------------
    Attachment: msck_call_stack_with_fix.png

Thanks for the patch [~hsubramaniyan]. With the patch, the query is invoked in the DB only
once. I have attached the profiler output. The FileSystem.exists() calls are expensive because
they contact S3 in my case.

> hive msck listpartitions need to make use of directSQL instead of datanucleus
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-13788
>                 URL: https://issues.apache.org/jira/browse/HIVE-13788
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Minor
>         Attachments: HIVE-13788.1.patch, msck_call_stack_with_fix.png, msck_stack_trace.png
>
>
> Currently, for tables with thousands of partitions, too many DB calls are made via DataNucleus.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
