ignite-dev mailing list archives

From "Vladimir Ozerov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-6079) SQL: implement base table statistics
Date Wed, 16 Aug 2017 11:54:00 GMT
Vladimir Ozerov created IGNITE-6079:

             Summary: SQL: implement base table statistics
                 Key: IGNITE-6079
                 URL: https://issues.apache.org/jira/browse/IGNITE-6079
             Project: Ignite
          Issue Type: Task
          Components: sql
    Affects Versions: 2.1
            Reporter: Vladimir Ozerov
             Fix For: 2.2

Ignite lacks a cost-based optimizer, which prevents us from building efficient execution plans.
Let's start moving in this direction.

The ticket is about creating local statistics for tables. In the first phase they will not
be shared between nodes, nor will they participate in query optimization. The ultimate
goal of this ticket is to start gathering some info in the background and provide necessary
internal infrastructure and APIs for that.

*1. API*
Let's start with a single method {{GridQueryProcessor.rebuildStatistics()}}, which will build
stats for all existing tables.

*2. Infrastructure*
- Statistics are transient, not persisted
- We need a background worker that rebuilds them on a regular basis and replaces the old
statistics with the new ones using a copy-on-write approach
- Statistics are created for indexed (i.e. sorted) columns
- Sampling should be used to avoid full table scan
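
To make the copy-on-write idea concrete, here is a minimal sketch in plain Java. It is not Ignite code: {{LocalStatisticsRegistry}}, its methods, and the table-name-to-row-count map are hypothetical names chosen for illustration, and the collector is assumed to be some sampling routine supplied by the caller.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder for transient, node-local statistics (not an Ignite class). */
final class LocalStatisticsRegistry {
    /** Current immutable snapshot: table name -> estimated row count. */
    private final AtomicReference<Map<String, Long>> snapshot =
        new AtomicReference<>(Collections.emptyMap());

    /** Assumed to gather fresh statistics, e.g. by sampling indexed columns. */
    private final Supplier<Map<String, Long>> collector;

    LocalStatisticsRegistry(Supplier<Map<String, Long>> collector) {
        this.collector = collector;
    }

    /** Builds a new map off to the side, then publishes it with one reference swap. */
    void rebuild() {
        Map<String, Long> fresh = new HashMap<>(collector.get());
        snapshot.set(Collections.unmodifiableMap(fresh));
    }

    /** Readers always see a complete snapshot and never a half-built one. */
    Map<String, Long> current() {
        return snapshot.get();
    }

    /** Starts a daemon worker that rebuilds the statistics with a fixed delay. */
    ScheduledExecutorService startBackgroundRebuild(long periodMillis) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stats-rebuild");
            t.setDaemon(true);
            return t;
        });
        // Note: if rebuild() throws, the executor suppresses later runs.
        exec.scheduleWithFixedDelay(this::rebuild, 0, periodMillis, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

Nothing is persisted here, so the statistics disappear with the process. A reader that obtained a snapshot before a rebuild keeps using the old map, while new readers pick up the new one.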

*3. Statistics types*
- Height-based: the whole value range is split into N pieces, so that exactly M/N entries are
located between boundaries X and X+1, where M is the number of records
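
A rough sketch of how such boundaries could be computed from a set of values (for example, a sample of an indexed column) is shown below. The class and method names are hypothetical and not part of Ignite, and the linear interpolation inside a piece is an assumption added only to show how the boundaries might be used later for estimates.

```java
import java.util.Arrays;

/** Hypothetical equal-height histogram helper (not an Ignite class). */
final class HeightBasedHistogramSketch {
    /**
     * Returns N+1 boundaries such that each of the N pieces between
     * boundaries[i] and boundaries[i+1] covers about M/N of the input values.
     */
    static long[] boundaries(long[] values, int pieces) {
        if (values.length == 0 || pieces <= 0)
            throw new IllegalArgumentException("need values and a positive piece count");

        long[] sorted = values.clone(); // do not mutate the caller's array
        Arrays.sort(sorted);

        int m = sorted.length;
        long[] bounds = new long[pieces + 1];
        for (int i = 0; i < pieces; i++)
            bounds[i] = sorted[(int) ((long) i * m / pieces)];
        bounds[pieces] = sorted[m - 1];
        return bounds;
    }

    /** Estimates the fraction of records with a value below x. */
    static double estimateFractionBelow(long[] bounds, long x) {
        int n = bounds.length - 1;
        if (x <= bounds[0])
            return 0.0;
        if (x > bounds[n])
            return 1.0;
        for (int i = 0; i < n; i++) {
            long lo = bounds[i];
            long hi = bounds[i + 1];
            if (x <= hi) {
                // Here lo < x <= hi; assume values are spread evenly inside the piece.
                double within = (double) (x - lo) / (hi - lo);
                return (i + within) / n;
            }
        }
        return 1.0;
    }
}
```

For the values 1..100 and N = 4 this yields the boundaries 1, 26, 51, 76, 100, so each piece holds about 25 records.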

One statistics type should be enough in the first iteration.
