impala-issues mailing list archives

From "Alexander Behm (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (IMPALA-4986) Use Parquet statistics when evaluating min/max/count aggregates
Date Fri, 17 Mar 2017 21:35:41 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Behm reassigned IMPALA-4986:
--------------------------------------

    Assignee: Taras Bobrovytsky

> Use Parquet statistics when evaluating min/max/count aggregates
> ---------------------------------------------------------------
>
>                 Key: IMPALA-4986
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4986
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 2.9.0
>            Reporter: Lars Volker
>            Assignee: Taras Bobrovytsky
>              Labels: parquet, performance, ramp-up
>
> Parquet statistics, such as num_rows and parquet::Statistics, can be used in several
> ways to speed up aggregation queries that compute min/max/count. Some of these
> improvements can be made at execution time only; others also require query-plan
> modifications. The subtasks illustrate the various optimization opportunities and
> dimensions, and can be tackled separately.
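
As a rough illustration of the idea in the description, the sketch below answers
count(*), min, max and count(col) from per-row-group metadata alone, without reading
any column data. It is not Impala code: RowGroup, ColumnStats, count_star and
min_max_count are hypothetical names that only model a row group's num_rows and the
min/max/null_count fields of parquet::Statistics. It assumes an unfiltered, ungrouped
query over a single integer column, and it gives up (returns None) when any row group
lacks usable statistics, in which case a regular scan would be needed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ColumnStats:
    """Hypothetical stand-in for the parquet::Statistics of one column chunk."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: Optional[int]


@dataclass
class RowGroup:
    """Hypothetical stand-in for the footer metadata of one row group."""
    num_rows: int
    stats: Optional[ColumnStats]  # None when the writer recorded no statistics


def count_star(row_groups: List[RowGroup]) -> int:
    """count(*) needs only num_rows from each row group."""
    return sum(rg.num_rows for rg in row_groups)


def min_max_count(
    row_groups: List[RowGroup],
) -> Optional[Tuple[Optional[int], Optional[int], int]]:
    """Return (min(col), max(col), count(col)) from metadata only.

    Returns None if any row group lacks the statistics needed, meaning the
    caller would have to fall back to scanning the data.
    """
    lo: Optional[int] = None
    hi: Optional[int] = None
    non_null = 0
    for rg in row_groups:
        s = rg.stats
        if s is None or s.null_count is None:
            return None  # cannot answer from metadata alone
        if s.null_count == rg.num_rows:
            continue  # empty or all-NULL chunk contributes nothing
        if s.min_value is None or s.max_value is None:
            return None
        lo = s.min_value if lo is None else min(lo, s.min_value)
        hi = s.max_value if hi is None else max(hi, s.max_value)
        non_null += rg.num_rows - s.null_count
    return lo, hi, non_null
```

The all-or-nothing fallback keeps the sketch simple; a real implementation could
instead scan only the row groups whose statistics are missing.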



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
