ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vladimir Ozerov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-6089) SQL: Improve query parallelism architecture
Date Wed, 16 Aug 2017 14:54:00 GMT
Vladimir Ozerov created IGNITE-6089:
---------------------------------------

             Summary: SQL: Improve query parallelism architecture
                 Key: IGNITE-6089
                 URL: https://issues.apache.org/jira/browse/IGNITE-6089
             Project: Ignite
          Issue Type: Task
          Components: sql
    Affects Versions: 2.1
            Reporter: Vladimir Ozerov


Currently query parallelism implement with static split of all indexes (including PK) for
cache. This approach has several major disadvantages:
1) It improves scans, but slows down index and range lookups
2) Tables with different DOP cannot be used in the same query

We need to fix that. Proposed plan:
1) No more index splits, ever - there is one and only one index always
2) Use preliminary execution plan, statistics (IGNITE-6079), CPU cores count and CPU load
to estimate whether query will benefit from parallelism. 
3) if yes - split node-s single map query into several independent pieces. 

Splitting can be achieved in one of the following ways:
1) Partition-based: e.g. if node owns partitions A, B, C and D, then we can split it to two
queries - one over (A, B), another over (C, D). This could be useful for pure scans (e.g.
DWH)
2) Histogram-based: e.g. if we have a query {{SELECT ... WHERE salary > 50}}, and we know
salary distribution, we can split it into {{WHERE salary > 50 AND salary <= 200}} and
{{WHERE salary > 200}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message