hawq-dev mailing list archives

From "Oleksandr Diachenko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1404) PXF to leverage file-level stats of ORC file and emit records for COUNT(*)
Date Fri, 24 Mar 2017 02:11:41 GMT
Oleksandr Diachenko created HAWQ-1404:
-----------------------------------------

             Summary: PXF to leverage file-level stats of ORC file and emit records for COUNT(*)
                 Key: HAWQ-1404
                 URL: https://issues.apache.org/jira/browse/HAWQ-1404
             Project: Apache HAWQ
          Issue Type: Improvement
          Components: PXF
            Reporter: Oleksandr Diachenko
            Assignee: Ed Espino


When a user issues a COUNT(*) query without a WHERE clause, PXF should be able to use the
file-level stats of an ORC file and emit the given number of records back to HAWQ, without
reading the actual tuples from disk. This should be a first step in enabling PXF to use ORC stats
(at the file, stripe and row group levels) so that we can improve a wider range of aggregate queries.

So whenever PXF receives "count" as the value of the AGG-TYPE parameter, it should optimize the
query by emitting tuples based on ORC file-level stats.
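
A rough sketch of the proposed behaviour, for illustration only. The class and method
names (CountShortcut, records, placeholderRows) are hypothetical and are not PXF code; the
empty placeholder tuple is also an assumption, since the issue does not say what the emitted
records should contain. In a real implementation the row count would presumably come from the
ORC reader's file-level metadata (for example Reader.getNumberOfRows() in the ORC Java API)
rather than being passed in as a plain long.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical sketch of the COUNT(*) shortcut; not actual PXF code. */
class CountShortcut {

    // Assumption: the emitted records carry no column data, only their count matters.
    private static final Object[] EMPTY_TUPLE = new Object[0];

    /** True when the AGG-TYPE value asks for a plain count. */
    static boolean isCountAggregate(String aggType) {
        return aggType != null && aggType.trim().equalsIgnoreCase("count");
    }

    /** Yields exactly n placeholder tuples without reading any data. */
    static Iterator<Object[]> placeholderRows(final long n) {
        return new Iterator<Object[]>() {
            private long remaining = n;

            @Override
            public boolean hasNext() {
                return remaining > 0;
            }

            @Override
            public Object[] next() {
                if (remaining <= 0) {
                    throw new NoSuchElementException();
                }
                remaining--;
                return EMPTY_TUPLE;
            }
        };
    }

    /**
     * Chooses between the stats-based shortcut and the normal scan.
     *
     * @param aggType       value of the AGG-TYPE parameter, may be null
     * @param hasFilter     true if the query has a WHERE clause
     * @param statsRowCount row count taken from the ORC file-level stats
     * @param fullScan      iterator that reads the actual tuples from disk
     */
    static Iterator<Object[]> records(String aggType, boolean hasFilter,
                                      long statsRowCount, Iterator<Object[]> fullScan) {
        if (isCountAggregate(aggType) && !hasFilter) {
            // Shortcut: fullScan is never touched, so no tuples are read.
            return placeholderRows(statsRowCount);
        }
        // Any filter or other aggregate falls back to reading the data.
        return fullScan;
    }
}
```

With AGG-TYPE set to "count" and no filter, records(...) yields statsRowCount placeholder
tuples and never advances the scan iterator; in every other case the scan iterator is returned
unchanged.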



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
