hive-issues mailing list archives

From "Ashutosh Chauhan (Jira)" <>
Subject [jira] [Updated] (HIVE-23166) Guard VGB from flushing too often
Date Sun, 12 Apr 2020 22:07:00 GMT


Ashutosh Chauhan updated HIVE-23166:
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Panos!

> Guard VGB from flushing too often
> ---------------------------------
>                 Key: HIVE-23166
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap
>    Affects Versions: 4.0.0
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>         Attachments: HIVE-23166.01.patch, HIVE-23166.02.patch, HIVE-23166.03.patch
> The existing flush logic in our VectorGroupByOperator is completely static.
> It depends on the number of HtEntries (*hive.vectorized.groupby.maxentries*) and the
> MAX memory threshold (by default 90% of available memory).
> Assuming that we are not memory constrained, the periodicity of flushing is currently
> dictated by the static number of entries (1M by default), which can also be misconfigured to
> a very low value.
> I am proposing, along with maxHtEntries, to also take current memory usage into account,
> to avoid flushing too often, as it can hurt operator throughput for particular workloads.
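
For illustration only, the proposed decision could be sketched roughly as below. This is not the code from the attached patches: the class name, the method names, and the 50% "low memory" cut-off are assumptions made for the example. Only the entry limit (maxHtEntries) and the 90% memory threshold come from the description above.

```java
// Hypothetical sketch of a memory-aware flush check; not the HIVE-23166 patch.
final class FlushPolicySketch {
    // From the issue description: flush once 90% of available memory is in use.
    static final double MAX_MEMORY_THRESHOLD = 0.90;
    // Assumed value for this example: below this fraction, memory is considered plentiful.
    static final double LOW_MEMORY_FRACTION = 0.50;

    /**
     * Decides whether the hash table should be flushed.
     *
     * @param numEntries         current number of hash table entries
     * @param maxHtEntries       configured entry limit (hive.vectorized.groupby.maxentries)
     * @param usedMemoryFraction fraction of available memory currently in use, in [0, 1]
     */
    static boolean shouldFlush(long numEntries, long maxHtEntries, double usedMemoryFraction) {
        // Memory pressure always forces a flush, regardless of the entry count.
        if (usedMemoryFraction >= MAX_MEMORY_THRESHOLD) {
            return true;
        }
        // Below the entry limit and without memory pressure: keep aggregating.
        if (numEntries < maxHtEntries) {
            return false;
        }
        // Entry limit reached: flush only if memory usage is not low, so that a
        // misconfigured (very small) maxHtEntries does not cause constant flushing.
        return usedMemoryFraction >= LOW_MEMORY_FRACTION;
    }

    /** Rough heap usage estimate from the JVM runtime (an approximation). */
    static double usedHeapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    public static void main(String[] args) {
        // Entry limit reached; the outcome depends on current heap usage.
        System.out.println(shouldFlush(1_000_000L, 1_000_000L, usedHeapFraction()));
    }
}
```

With these assumed thresholds, a table that has reached maxHtEntries while memory usage is still low keeps growing instead of flushing, which is the throughput concern raised in the description.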

This message was sent by Atlassian Jira
