impala-issues mailing list archives

From "Alexander Behm (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IMPALA-5096) Use parquet::Statistics for min/max aggregates when only a subset of scan columns have stats
Date Fri, 17 Mar 2017 21:21:41 GMT
Alexander Behm created IMPALA-5096:
--------------------------------------

             Summary: Use parquet::Statistics for min/max aggregates when only a subset of scan columns have stats
                 Key: IMPALA-5096
                 URL: https://issues.apache.org/jira/browse/IMPALA-5096
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
    Affects Versions: Impala 2.8.0
            Reporter: Alexander Behm


If only some of the scanned columns have parquet::Statistics, we can still use the stats
of the columns that do have them, although this takes more effort. For each column that has
stats, we can populate the scanner's template tuple with the stats values and avoid scanning
and materializing that column. Columns without stats still need to be scanned.
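
The idea above can be sketched roughly as follows. This is only an illustration in Python, not Impala backend code: ColumnChunk and min_max_for_row_group are hypothetical names, the "stats" field stands in for the min/max values of parquet::Statistics, and the sketch assumes non-empty columns without NULLs and a query with no filters or grouping.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ColumnChunk:
    """Hypothetical stand-in for one column of a Parquet row group."""
    values: List[int]                        # decoded values (expensive to read)
    stats: Optional[Tuple[int, int]] = None  # (min, max) if the writer stored stats


def min_max_for_row_group(chunks: Dict[str, ColumnChunk]):
    """Return ({column: (min, max)}, [columns that had to be scanned])."""
    result = {}
    scanned = []
    for name, chunk in chunks.items():
        if chunk.stats is not None:
            # Stats available: take min/max from metadata and skip the values.
            result[name] = chunk.stats
        else:
            # No stats: fall back to scanning the column values.
            scanned.append(name)
            result[name] = (min(chunk.values), max(chunk.values))
    return result, scanned


row_group = {
    "a": ColumnChunk(values=[5, 1, 9], stats=(1, 9)),
    "b": ColumnChunk(values=[7, 3, 4]),  # writer did not store stats
}
result, scanned = min_max_for_row_group(row_group)
```

Here only column "b" is read; the values for "a" come straight from its stats, which is the role the template tuple would play in the scanner.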





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
