hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10068) LLAP: adjust allocation after decompression
Date Wed, 03 Jun 2015 20:03:38 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571562#comment-14571562 ]

Sergey Shelukhin commented on HIVE-10068:
-----------------------------------------

Update from some test runs on TPC-DS and TPC-H queries: we waste around 14% of allocated
memory due to buddy allocator granularity. The totals below are summed from the log:
{noformat}
$ sed -E "s/.*ALLOCATED_BYTES=([0-9]+).*/\1/" lrfu1.log | awk '{s+=$1}END{print s}'
278162046976
$ sed -E "s/.*ALLOCATED_USED_BYTES=([0-9]+).*/\1/" lrfu1.log | awk '{s+=$1}END{print s}'
238565954908
{noformat}
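
As a quick check of the arithmetic, the two totals above can be turned into a waste percentage. This is only a sketch: `waste_pct` is a hypothetical helper name, not part of Hive or LLAP, and it assumes the first argument is the ALLOCATED_BYTES total and the second is the ALLOCATED_USED_BYTES total.

```shell
# Hypothetical helper (not part of Hive): percentage of allocated bytes
# that were never used. $1 = total allocated bytes, $2 = total used bytes.
waste_pct() {
  awk -v a="$1" -v u="$2" 'BEGIN { printf "%.1f\n", 100 * (a - u) / a }'
}

# Totals taken from the two sums above.
waste_pct 278162046976 238565954908   # prints 14.2
```

In absolute terms that is 39596092068 bytes allocated but not used across these runs.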

Some of that is obviously unavoidable, but some could be avoided by implementing this. However,
it's not as bad as I expected (bad results can be seen on very small datasets, where stripes/RGs
are routinely smaller than the compression block size).

> LLAP: adjust allocation after decompression
> -------------------------------------------
>
>                 Key: HIVE-10068
>                 URL: https://issues.apache.org/jira/browse/HIVE-10068
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>
> We don't know the decompressed size of a compression buffer in ORC; all we know is the file-level
> compression buffer size. For many files, compression buffers can be smaller than that because
> of compact encoding, or because the compression block ends for other reasons (different streams,
> etc.; "present" streams, for example, are very small).
> BuddyAllocator should be able to accept back parts of the allocated memory (e.g. allocate
> 256 KB with a minimum allocation of 32 KB, decompress 45 KB, return the last 192 KB as 64+128 KB).
> For generality (this depends on implementation), we can make an API like "offer", and the allocator
> can decide to take back however much it can.
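
The arithmetic in the 256 KB example can be sketched as follows. This is not the BuddyAllocator implementation or the proposed "offer" API; `returnable_blocks` is a hypothetical helper that only computes which power-of-two blocks could be handed back. It assumes the allocated size and the minimum allocation are powers of two and that the used size does not exceed the allocated size.

```shell
# Hypothetical helper (not a Hive API): print the sizes, in bytes, of the buddy
# blocks that could be offered back after decompression.
# $1 = allocated bytes, $2 = bytes actually used, $3 = minimum allocation.
returnable_blocks() (
  allocated=$1 used=$2 min_alloc=$3
  keep=$min_alloc
  # Smallest power-of-two block, at least min_alloc, that holds the used bytes.
  while [ "$keep" -lt "$used" ]; do keep=$((keep * 2)); done
  blocks=""
  # Each halving of the original block frees one buddy; list them smallest first.
  while [ "$keep" -lt "$allocated" ]; do
    blocks="${blocks:+$blocks }$keep"
    keep=$((keep * 2))
  done
  echo "$blocks"
)

# The example from the description: 256 KB allocated, 45 KB used, 32 KB minimum.
returnable_blocks $((256 * 1024)) $((45 * 1024)) $((32 * 1024))   # prints 65536 131072
```

The 45 KB of data stays in the first 64 KB block, and the remaining 192 KB comes back as one 64 KB and one 128 KB block. A real allocator might accept only some of these, which is why the description suggests letting it decide how much to take back.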



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
