hive-issues mailing list archives

From "Vaibhav Gumashta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10503) Aggregate stats cache: follow up optimizations
Date Mon, 27 Apr 2015 19:04:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514724#comment-14514724 ]

Vaibhav Gumashta commented on HIVE-10503:
-----------------------------------------

cc [~thejas] [~mmokhtar]. I'll start working on this in a few days.

> Aggregate stats cache: follow up optimizations
> ----------------------------------------------
>
>                 Key: HIVE-10503
>                 URL: https://issues.apache.org/jira/browse/HIVE-10503
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 1.2.0
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>             Fix For: 1.3.0
>
>
> Some follow up work items:
> 1. Estimate cache nodes from memory size - currently the user needs to specify size based on #nodes.
> 2. Make the AggregateStatsCache#add method asynchronous - adding to cache can happen in a new thread.
> 3. Based on perf testing, explore an alternate data structure for the node list per cache key.
> 4. Explore ideas to reduce locking granularity of the value list per cache key.
> 5. Finding a match currently uses an O(n*n) loop - that should go away.
> 6. Use a single call to the DB to get aggregates for all columns not in the cache.
> 7. Organize metrics capturing in a better way.
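
Items 1 and 2 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Hive's AggregateStatsCache: the class AsyncStatsCacheSketch, its method names, the String key and node types, and the per-node byte estimate are all hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a stats cache; not the real Hive class.
final class AsyncStatsCacheSketch {
    // Cache key -> list of nodes (plain strings here, for illustration only).
    private final Map<String, List<String>> nodesByKey = new ConcurrentHashMap<>();
    // A single writer thread keeps cache insertion off the caller's thread.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Item 1: derive a node limit from a memory budget instead of asking
    // the user for a node count. The per-node size is an assumed estimate.
    static long estimateMaxNodes(long memoryBudgetBytes, long assumedBytesPerNode) {
        if (assumedBytesPerNode <= 0) {
            throw new IllegalArgumentException("assumedBytesPerNode must be positive");
        }
        return memoryBudgetBytes / assumedBytesPerNode;
    }

    // Item 2: the caller returns immediately; the insert runs on the writer thread.
    CompletableFuture<Void> addAsync(String key, String node) {
        return CompletableFuture.runAsync(
            () -> nodesByKey.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(node),
            writer);
    }

    // Returns the nodes cached for a key, or an empty list if there are none.
    List<String> get(String key) {
        return nodesByKey.getOrDefault(key, Collections.<String>emptyList());
    }

    // Stops the writer thread so the JVM can exit.
    void shutdown() {
        writer.shutdown();
    }
}
```

A caller that needs to observe the insert can call join() on the returned future; otherwise the result can be ignored and the add completes in the background.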



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
