hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15565) LLAP: GroupByOperator flushes hash table too frequently
Date Thu, 19 Jan 2017 04:06:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829282#comment-15829282 ]

Prasanth Jayachandran commented on HIVE-15565:
----------------------------------------------

This delays the memory check in the hope that memory will be freed up in the meantime.
However, that is not guaranteed and may not happen at all, because of the on-heap metadata
cache and because other executors may be performing allocations at the same time.

LGTM, +1. Pending tests.
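
For readers following along, the idea of delaying a memory check can be sketched roughly as
below. This is only an illustration, not the code in the attached patch: the class name
DelayedMemoryCheck, its parameters, and the fixed row interval are all hypothetical, and the
heap reading is passed in as a plain callable so the example stays self-contained.

```python
from typing import Callable


class DelayedMemoryCheck:
    """Hypothetical helper: consult heap usage only every `interval` rows."""

    def __init__(self, used_bytes: Callable[[], int], max_bytes: int,
                 threshold: float, interval: int) -> None:
        self.used_bytes = used_bytes      # callable returning current heap usage
        self.max_bytes = max_bytes        # memory budget the usage is compared to
        self.threshold = threshold        # fraction of the budget that triggers a flush
        self.interval = interval          # ASSUMPTION: rows to wait between checks
        self.rows_since_check = 0

    def should_flush(self) -> bool:
        """Call once per row; True only when a check actually runs and fails."""
        self.rows_since_check += 1
        if self.rows_since_check < self.interval:
            # Check deferred: memory may (or may not) be freed in the meantime.
            return False
        self.rows_since_check = 0
        return self.used_bytes() / self.max_bytes > self.threshold


# Usage: heap reads 90% at the first check and 40% at the second one.
readings = iter([90, 40])
check = DelayedMemoryCheck(lambda: next(readings), max_bytes=100,
                           threshold=0.5, interval=3)
results = [check.should_flush() for _ in range(6)]
```

With an interval of 3, only the third and sixth rows trigger a real check, so a flush is
requested at most once per interval instead of on every record. As noted above, nothing
guarantees that usage has dropped by the time the next check runs.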

> LLAP: GroupByOperator flushes hash table too frequently
> -------------------------------------------------------
>
>                 Key: HIVE-15565
>                 URL: https://issues.apache.org/jira/browse/HIVE-15565
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15565.1.patch, HIVE-15565.2.patch
>
>
> {{GroupByOperator::isTez}} would be true in LLAP mode. Current memory computations can
> go wrong with {{isTez}} checks in {{GroupByOperator}}. For example, in an LLAP instance with
> Xmx128G and 12 executors, it would start flushing the hash table for every record once memory
> usage reaches around 42GB (hive.tez.container.size=7100, hive.map.aggr.hash.percentmemory=0.5).
> {noformat}
> 2017-01-08T23:40:21,339 INFO  [TezTaskRunner (1480722417364_1922_7_03_000004_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Table flushed: new size = 0
> 2017-01-08T23:40:21,339 INFO  [TezTaskRunner (1480722417364_1922_7_03_000012_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Table flushed: new size = 0
> 2017-01-08T23:40:21,339 INFO  [TezTaskRunner (1480722417364_1922_7_03_000004_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Tbl flush: #hash table = 1
> 2017-01-08T23:40:21,339 INFO  [TezTaskRunner (1480722417364_1922_7_03_000012_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Tbl flush: #hash table = 1
> {noformat}
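
The figure of around 42GB in the description is consistent with multiplying the configured
container size by the number of executors and by the hash memory fraction. That derivation is
an assumption made here for illustration; the description gives only the inputs and the
result, not the formula. A quick check of the arithmetic:

```python
# Values quoted in the issue description.
container_size_mb = 7100       # hive.tez.container.size
hash_percent_memory = 0.5      # hive.map.aggr.hash.percentmemory
num_executors = 12
xmx_gb = 128                   # LLAP daemon heap (Xmx128G)

# ASSUMPTION: threshold = container size x executors x hash memory fraction.
threshold_mb = container_size_mb * num_executors * hash_percent_memory
threshold_gb = threshold_mb / 1024

print(f"{threshold_mb:.0f} MB ~= {threshold_gb:.1f} GB of a {xmx_gb} GB heap")
```

Under that assumption the threshold comes to 42600 MB, roughly a third of the 128GB heap,
which would explain why flushing starts long before the daemon is actually short of memory.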



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
