arrow-dev mailing list archives

From "Andy Grove (Jira)" <>
Subject [jira] [Created] (ARROW-6690) [Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD
Date Wed, 25 Sep 2019 13:59:00 GMT
Andy Grove created ARROW-6690:

             Summary: [Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD
                 Key: ARROW-6690
             Project: Apache Arrow
          Issue Type: Sub-task
          Components: Rust, Rust - DataFusion
            Reporter: Andy Grove
             Fix For: 1.0.0

Currently the implementation of HashAggregate in the new physical plan uses the same logic
regardless of whether a grouping expression is used.

For the case where there is no grouping expression, such as "SELECT SUM(a) FROM b", we can
use the compute kernels to perform an aggregate operation on each batch, rather than iterating
over each row and accumulating individual values.
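
As a rough illustration of the difference, the sketch below contrasts the two strategies
for a SUM with no GROUP BY. It does not use the Arrow or DataFusion APIs: the `Batch` type
and the functions `sum_row_by_row`, `sum_kernel`, and `sum_batch_at_a_time` are hypothetical
stand-ins, with `sum_kernel` playing the role of an Arrow compute kernel that reduces a whole
column to a single value.

```rust
/// Hypothetical stand-in for a nullable Int64 column in one record batch.
type Batch = Vec<Option<i64>>;

/// Row-at-a-time strategy: the accumulator is updated once per row.
fn sum_row_by_row(batches: &[Batch]) -> Option<i64> {
    let mut acc: Option<i64> = None;
    for batch in batches {
        for value in batch {
            // Nulls are skipped; the result stays None until a value is seen.
            if let Some(v) = value {
                acc = Some(acc.unwrap_or(0) + v);
            }
        }
    }
    acc
}

/// Stand-in for a compute kernel: reduces one whole batch to a partial sum.
/// Returns None when the batch is empty or contains only nulls.
fn sum_kernel(batch: &Batch) -> Option<i64> {
    if batch.iter().all(|v| v.is_none()) {
        return None;
    }
    // A tight loop over one column; a real kernel could vectorize this step.
    Some(batch.iter().map(|v| v.unwrap_or(0)).sum())
}

/// Batch-at-a-time strategy: one kernel call per batch, then the
/// per-batch partial sums are combined into the final result.
fn sum_batch_at_a_time(batches: &[Batch]) -> Option<i64> {
    batches
        .iter()
        .filter_map(sum_kernel)
        .fold(None, |acc: Option<i64>, partial| {
            Some(acc.unwrap_or(0) + partial)
        })
}
```

Both functions return the same result; the batch-at-a-time version only touches the
accumulator once per batch, which is where a SIMD-capable kernel would be expected to help.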

This optimization already exists in the original implementation, which executes aggregate
queries directly from the logical plan.

This message was sent by Atlassian Jira
