hive-issues mailing list archives

From "Misha Dmitriev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
Date Mon, 10 Sep 2018 19:57:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609712#comment-16609712
] 

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

I've made the above changes - that was easy. However, when I tried to run some tests locally,
they failed. It turns out that the {{org.apache.hadoop.util.GcTimeMonitor}} constructor has a
{{Preconditions.checkArgument(maxGcTimePercentage <= 100)}} check. It seems that sanity checks
sometimes have unwanted effects...
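
For reference, here is roughly what trips it. This is a minimal sketch assuming the Hadoop
constructor signature {{GcTimeMonitor(observationWindowMs, sleepIntervalMs, maxGcTimePercentage, alertHandler)}};
the {{null}} alert handler is for illustration only:

{code:java}
import org.apache.hadoop.util.GcTimeMonitor;

public class GcTimeMonitorCheckDemo {
  public static void main(String[] args) {
    try {
      // A maxGcTimePercentage above 100 (here, 150) trips the
      // Preconditions.checkArgument(...) sanity check, so the
      // constructor throws before the monitor thread ever starts.
      new GcTimeMonitor(60_000L, 1_000L, 150, null);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected by sanity check: " + e.getMessage());
    }
  }
}
{code}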

In this situation, it looks like the fastest way to address the problem is to copy the above
GcTimeMonitor class into the Hive code base, and then modify it so that instead of the {{GarbageCollectorMXBean}}
it uses the older, battle-tested mechanism that's the same as in the existing {{JvmPauseMonitor}}
class. Makes sense?
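
Concretely, the {{JvmPauseMonitor}}-style mechanism boils down to a daemon thread that sleeps
for a fixed interval and attributes any overshoot to JVM pauses (mostly GC). A rough sketch,
with made-up class and field names:

{code:java}
// Sketch of the sleep-overshoot technique: any time spent beyond the
// requested sleep is time the JVM was stalled, usually in GC.
public class PauseEstimator implements Runnable {
  private static final long SLEEP_INTERVAL_MS = 500;
  private volatile long totalPauseMs;  // written by the monitor thread only

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      long startNs = System.nanoTime();
      try {
        Thread.sleep(SLEEP_INTERVAL_MS);
      } catch (InterruptedException ie) {
        return;
      }
      long elapsedMs = (System.nanoTime() - startNs) / 1_000_000;
      long pauseMs = elapsedMs - SLEEP_INTERVAL_MS;
      if (pauseMs > 0) {
        totalPauseMs += pauseMs;
      }
    }
  }

  public long getTotalPauseMs() {
    return totalPauseMs;
  }
}
{code}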

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -----------------------------------------------------
>
>                 Key: HIVE-17684
>                 URL: https://issues.apache.org/jira/browse/HIVE-17684
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Misha Dmitriev
>            Priority: Major
>         Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, HIVE-17684.03.patch, HIVE-17684.04.patch,
HIVE-17684.05.patch, HIVE-17684.06.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of the {{MapJoinMemoryExhaustionHandler}}.
This handler is meant to detect scenarios where the small table is taking up too much space in
memory, in which case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic to estimate
how much memory the {{HashMap}} is consuming: {{MemoryMXBean#getHeapMemoryUsage().getUsed()
/ MemoryMXBean#getHeapMemoryUsage().getMax()}}
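
For illustration only (this is not the actual handler code), the estimate described above
amounts to something like:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapFractionDemo {
  public static void main(String[] args) {
    // getMax() can be -1 if no limit is defined; assume a fixed -Xmx here.
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    double usedFraction = (double) heap.getUsed() / heap.getMax();
    double threshold = 0.90;  // hive.mapjoin.localtask.max.memory.usage default
    System.out.printf("Heap usage: %.2f (threshold %.2f)%n", usedFraction, threshold);
    if (usedFraction > threshold) {
      // The real handler throws MapJoinMemoryExhaustionError at this point.
      System.out.println("Would abort the map-join here.");
    }
  }
}
{code}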
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be inaccurate.
The value this method returns includes all reachable and unreachable objects on the heap,
so there may be a bunch of garbage data that the JVM simply hasn't taken the time to reclaim
yet. This can lead to intermittent failures of this check even though a simple GC would
have reclaimed enough space for the process to continue working.
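
A quick way to see this staleness first-hand: allocate some objects, drop the references,
and compare {{getUsed()}} before and after an explicit GC. {{System.gc()}} is only a hint to
the JVM, but it usually collects in a toy example like this:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class StaleUsedDemo {
  public static void main(String[] args) throws InterruptedException {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    byte[][] junk = new byte[64][];
    for (int i = 0; i < junk.length; i++) {
      junk[i] = new byte[1 << 20];  // ~64 MB total
    }
    junk = null;  // all of it is now unreachable garbage
    long before = bean.getHeapMemoryUsage().getUsed();
    System.gc();  // a hint, not a guarantee
    Thread.sleep(200);
    long after = bean.getHeapMemoryUsage().getUsed();
    // getUsed() kept counting the unreachable arrays until the GC ran.
    System.out.printf("used before GC: %d MB, after GC: %d MB%n",
        before >> 20, after >> 20);
  }
}
{code}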
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. In Hive-on-MR
this approach probably made sense, because every Hive task ran in a dedicated container,
so a Hive task could assume it had created most of the data on the heap. However, in Hive-on-Spark
there can be multiple Hive tasks running in a single executor, each doing different things.



