hive-issues mailing list archives

From "anishek (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-17814) Reduce Memory footprint for large database bootstrap replication load
Date Mon, 16 Oct 2017 06:37:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anishek reassigned HIVE-17814:
------------------------------


> Reduce Memory footprint for large database bootstrap replication load 
> ----------------------------------------------------------------------
>
>                 Key: HIVE-17814
>                 URL: https://issues.apache.org/jira/browse/HIVE-17814
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>             Fix For: 3.0.0
>
>
> As part of HIVE-16896, Query Tasks for bootstrap repl load are generated dynamically.
This was done because building all tasks up front for a large database produces a very large
graph with hundreds of thousands of objects, which puts additional memory pressure on Hive.

> The execution hooks, however, still keep a reference to the query plan, which is modified
dynamically, so by the end of all task execution Hive holds the whole DAG in memory, which
is what we have to prevent. Additionally, for PostExecution Hive hooks we store the
TaskRunner objects for each task that is executed.
> We have to handle these issues to prevent excessive memory usage for replication, specifically
bootstrap replication.
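
The retention pattern described above can be illustrated with a minimal, self-contained sketch. None of the classes below are Hive classes: Task, TaskRunner, RetainingHookContext and SummaryHookContext are hypothetical stand-ins invented for illustration, and the summary-only approach is one possible mitigation under that assumption, not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a unit of work; NOT org.apache.hadoop.hive.ql.exec.Task.
final class Task {
    final String name;
    final byte[] state = new byte[1024]; // simulates per-task memory
    Task(String name) { this.name = name; }
}

// Hypothetical stand-in for the object that executed a task.
final class TaskRunner {
    final Task task;
    TaskRunner(Task task) { this.task = task; }
}

// Keeps every completed runner reachable, so memory grows with the task count.
final class RetainingHookContext {
    private final List<TaskRunner> completed = new ArrayList<>();
    void onTaskDone(TaskRunner runner) { completed.add(runner); }
    int retainedRunners() { return completed.size(); }
}

// Keeps only a small summary, so finished runners can be garbage collected.
final class SummaryHookContext {
    private int completedCount;
    private String lastTaskName;
    void onTaskDone(TaskRunner runner) {
        completedCount++;
        lastTaskName = runner.task.name; // copy what is needed, drop the runner
    }
    int completedCount() { return completedCount; }
    String lastTaskName() { return lastTaskName; }
}

public class HookRetentionSketch {
    public static void main(String[] args) {
        RetainingHookContext retaining = new RetainingHookContext();
        SummaryHookContext summary = new SummaryHookContext();
        for (int i = 0; i < 100_000; i++) {
            TaskRunner runner = new TaskRunner(new Task("task-" + i));
            retaining.onTaskDone(runner);
            summary.onTaskDone(runner);
        }
        System.out.println("runners retained: " + retaining.retainedRunners());
        System.out.println("tasks counted:    " + summary.completedCount());
    }
}
```

In the sketch, the first context holds a reference to every runner (and through it every task) until the query ends, while the second holds a constant amount of state regardless of how many tasks were executed.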



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
