ignite-dev mailing list archives

From "Vladimir Ozerov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-6057) SQL: Full scan should be performed through data pages bypassing primary index
Date Mon, 14 Aug 2017 14:21:00 GMT
Vladimir Ozerov created IGNITE-6057:

             Summary: SQL: Full scan should be performed through data pages bypassing primary index
                 Key: IGNITE-6057
                 URL: https://issues.apache.org/jira/browse/IGNITE-6057
             Project: Ignite
          Issue Type: Improvement
          Components: persistence, sql
    Affects Versions: 2.1
            Reporter: Vladimir Ozerov
             Fix For: 2.2

Currently, both SQL full scans and {{CREATE INDEX}} commands iterate through the primary index to
get all existing values. Suppose we have 10 entries per data page on average. In this
case we will have to read the same data page 10 times, because its keys are reached at different
positions in the index tree. This can be very inefficient on certain workloads.
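
To make the arithmetic concrete, here is a minimal sketch that counts data page visits for the two access patterns. It is a toy model, not Ignite code: the class and method names are hypothetical, entries are assumed to fill pages in insertion order, and key order is assumed to be unrelated to insertion order (modelled with a seeded shuffle).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy model (hypothetical names, not Ignite internals).
final class ScanPageReads {
    /**
     * Page visits when walking entries in index (key) order. A visit is counted
     * each time the next entry lives on a different page than the previous one.
     */
    static long indexOrderPageVisits(int entries, int entriesPerPage, long seed) {
        // Slot i in insertion order is assumed to land on page i / entriesPerPage.
        List<Integer> slots = new ArrayList<>(entries);
        for (int i = 0; i < entries; i++)
            slots.add(i);

        // Assumption: key order is unrelated to insertion order.
        Collections.shuffle(slots, new Random(seed));

        long visits = 0;
        int prevPage = -1;
        for (int slot : slots) { // Iteration order stands in for index order.
            int page = slot / entriesPerPage;
            if (page != prevPage) {
                visits++;
                prevPage = page;
            }
        }
        return visits;
    }

    /** Page visits when iterating data pages directly: each page is read once. */
    static long dataPageScanVisits(int entries, int entriesPerPage) {
        return (entries + entriesPerPage - 1) / entriesPerPage;
    }

    public static void main(String[] args) {
        int entries = 100_000;
        int perPage = 10;
        System.out.println("index-order visits: " + indexOrderPageVisits(entries, perPage, 42L));
        System.out.println("data-page visits:   " + dataPageScanVisits(entries, perPage));
    }
}
```

Under these assumptions the index-order walk revisits pages close to once per entry, while the direct scan touches each page exactly once. Keys that correlate with insertion order would narrow the gap.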

We should iterate over data pages directly instead. This way a page with 10 entries will be
accessed only once. However, we should take cache groups into account: if there are too many
entries from other logical caches, this approach could make the situation even worse, unless we
have a mechanism to skip unnecessary entries (or whole pages!) efficiently.
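
One possible shape for such a skip mechanism is sketched below. Everything here is hypothetical: the ticket does not specify a design, and the {{Page}} and {{Entry}} types and the per-page set of cache IDs are assumptions made for illustration only. The idea is that a small per-page summary lets the scan skip a whole page without inspecting its entries.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a shared data page scan; not Ignite's page layout.
final class SharedPageScan {
    /** An entry tagged with the ID of the logical cache that owns it. */
    static final class Entry {
        final int cacheId;
        final String value;

        Entry(int cacheId, String value) {
            this.cacheId = cacheId;
            this.value = value;
        }
    }

    /** A data page shared by several logical caches of one cache group. */
    static final class Page {
        final List<Entry> entries = new ArrayList<>();
        /** Assumed per-page summary: IDs of caches that have entries here. */
        final Set<Integer> cacheIds = new HashSet<>();

        void add(Entry e) {
            entries.add(e);
            cacheIds.add(e.cacheId);
        }
    }

    /** Returns the values of one logical cache, skipping irrelevant pages. */
    static List<String> scan(List<Page> pages, int cacheId) {
        List<String> res = new ArrayList<>();
        for (Page p : pages) {
            if (!p.cacheIds.contains(cacheId))
                continue; // Whole page skipped using the summary alone.

            for (Entry e : p.entries) {
                if (e.cacheId == cacheId) // Skip entries of other caches.
                    res.add(e.value);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        Page p1 = new Page();
        p1.add(new Entry(1, "a"));
        p1.add(new Entry(2, "x"));
        Page p2 = new Page();
        p2.add(new Entry(2, "y"));

        List<Page> pages = new ArrayList<>();
        pages.add(p1);
        pages.add(p2);
        System.out.println(scan(pages, 1)); // Only p1 is inspected for cache 1.
    }
}
```

Skipping by summary saves per-entry work, but the page, or at least its header, still has to be reached. So if most pages belong to other caches, the scan can still cost more than an index walk.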

We should probably develop a cost-based model that takes the following statistics into account:
1) Average entry size. The larger the entry, the smaller the benefit, especially if overflow
pages are used frequently.
2) Cache groups. Ideally, we should estimate the number of entries from all logical caches. The
more entries from other caches, the smaller the benefit.
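
The two statistics above could feed a very rough estimate like the one sketched here. This is an illustration under stated assumptions, not a proposed implementation: all names are hypothetical, cost is measured only in data page reads, page headers and free space are ignored, and every entry in the cache group is assumed to have the same average size.

```java
// Hypothetical cost sketch; costs are estimated data page reads only.
final class FullScanCostModel {
    /**
     * Index scan: every entry of the target cache is reached through the index
     * and dereferenced separately, touching one page plus any overflow pages.
     */
    static double indexScanCost(long targetEntries, int avgEntrySize, int pageSize) {
        long pagesPerEntry = Math.max(1, (avgEntrySize + pageSize - 1) / pageSize);
        return (double) targetEntries * pagesPerEntry;
    }

    /**
     * Data page scan: all pages of the cache group are read once, no matter
     * which logical cache owns the entries stored on them.
     */
    static double dataPageScanCost(long groupEntries, int avgEntrySize, int pageSize) {
        double bytes = (double) groupEntries * avgEntrySize;
        return Math.ceil(bytes / pageSize);
    }

    /** True if the direct data page scan is estimated to read fewer pages. */
    static boolean preferDataPageScan(long targetEntries, long groupEntries,
                                      int avgEntrySize, int pageSize) {
        return dataPageScanCost(groupEntries, avgEntrySize, pageSize)
            < indexScanCost(targetEntries, avgEntrySize, pageSize);
    }

    public static void main(String[] args) {
        int pageSize = 4096; // Assumed page size for the example.

        // Small entries, single cache in the group: page scan should win.
        System.out.println(preferDataPageScan(1_000_000L, 1_000_000L, 400, pageSize));

        // Same cache, but it owns only 1% of the group's entries.
        System.out.println(preferDataPageScan(1_000_000L, 100_000_000L, 400, pageSize));

        // Entries spanning two pages each: no estimated benefit.
        System.out.println(preferDataPageScan(1_000L, 1_000L, 8192, pageSize));
    }
}
```

In this sketch the estimated advantage shrinks as entries approach or exceed the page size, and it reverses once other logical caches dominate the group. Both match the two points listed above.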

