impala-issues mailing list archives

From "Taras Bobrovytsky (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-4787) Optimize APPX_MEDIAN() mem usage in case of many grouping keys
Date Fri, 17 Mar 2017 00:05:41 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Taras Bobrovytsky resolved IMPALA-4787.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0

{code}
commit 1f4c37ab7e7e0bcc832e94f38fcf0a24970ae3c2
Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Date:   Wed Jan 4 14:33:08 2017 -0800

    IMPALA-3586: Implement union passthrough
    
    The union node acts as a pass-through operator and forwards row batches
    from its children without materializing them. This is done in the case
    when the child's tuple layout is identical to the union node's tuple layout
    and no functions need to be applied to the child row batches.
    
    Removed operand reordering in the FE because it's simpler and safer to
    handle all passthrough children before non-passthrough children in the
    BE. The recent improvements to memory management allowed us to remove
    this requirement.
    
    A new query option DISABLE_UNION_PASSTHROUGH was added in this patch
    as a precaution and for testing purposes.
    
    Testing:
    - Added new planner and end to end tests that cover the new
      functionality.
    - Updated existing tests to reflect the new behavior.
{code}
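
The passthrough rule described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala's backend code: `ChildOperand`, `is_passthrough`, and `union_all` are hypothetical names, a tuple layout is modelled as a plain tuple of type names, and the DISABLE_UNION_PASSTHROUGH query option is modelled as a boolean argument.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Row = Tuple[int, ...]
Batch = List[Row]


@dataclass
class ChildOperand:
    """Hypothetical stand-in for one child of a union node."""
    tuple_layout: Tuple[str, ...]   # toy model of the child's tuple layout
    needs_exprs: bool               # True if functions must be applied to its rows
    batches: List[Batch] = field(default_factory=list)


def is_passthrough(child: ChildOperand,
                   union_layout: Tuple[str, ...],
                   disable_union_passthrough: bool = False) -> bool:
    """A child qualifies only if its layout matches the union's layout
    and no functions need to be applied to its row batches."""
    if disable_union_passthrough:
        return False
    return child.tuple_layout == union_layout and not child.needs_exprs


def union_all(children: List[ChildOperand],
              union_layout: Tuple[str, ...],
              disable_union_passthrough: bool = False) -> List[Batch]:
    flags = [is_passthrough(c, union_layout, disable_union_passthrough)
             for c in children]
    output: List[Batch] = []
    # Passthrough children are handled first: batches are forwarded as-is.
    for child, passthrough in zip(children, flags):
        if passthrough:
            output.extend(child.batches)
    # Remaining children are materialized: rows are copied into new batches.
    for child, passthrough in zip(children, flags):
        if not passthrough:
            for batch in child.batches:
                output.append([tuple(row) for row in batch])
    return output
```

In this model a forwarded batch is the very same object the child produced, while a materialized batch is a fresh copy, which is the distinction the commit message draws.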

> Optimize APPX_MEDIAN() mem usage in case of many grouping keys
> --------------------------------------------------------------
>
>                 Key: IMPALA-4787
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4787
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.8.0
>            Reporter: Marcell Szabo
>            Assignee: Taras Bobrovytsky
>            Priority: Critical
>              Labels: usability
>             Fix For: Impala 2.9.0
>
>
> APPX_MEDIAN uses a lot of memory per grouping key. It allocates space for 20,000 samples
> per grouping key to estimate the median. The current implementation is targeted towards non-grouping
> aggregations or aggregations with relatively few distinct grouping keys.
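
To make the scale of the problem concrete, here is a rough back-of-the-envelope estimate. The 20,000 samples per grouping key comes from the description above; the 8 bytes per sample is purely an assumption for illustration (for example, a 64-bit value), and the function name is hypothetical. Any per-key bookkeeping overhead is ignored.

```python
SAMPLES_PER_KEY = 20_000          # from the issue description
ASSUMED_BYTES_PER_SAMPLE = 8      # assumption: one 64-bit value per sample


def appx_median_sample_bytes(num_grouping_keys: int,
                             bytes_per_sample: int = ASSUMED_BYTES_PER_SAMPLE) -> int:
    """Estimate the bytes reserved for samples alone across all grouping keys."""
    return num_grouping_keys * SAMPLES_PER_KEY * bytes_per_sample


if __name__ == "__main__":
    for keys in (1, 1_000, 1_000_000):
        print(keys, appx_median_sample_bytes(keys))
```

Under that assumption, a single grouping key reserves about 160 KB for samples, and one million distinct grouping keys would reserve about 160 GB, which is why the per-key allocation matters for aggregations with many grouping keys.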



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
