ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergey Kalashnikov (Jira)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-12295) Faster index eviction
Date Thu, 17 Oct 2019 09:39:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953560#comment-16953560

Sergey Kalashnikov commented on IGNITE-12295:

The following changes were made to demonstrate the approach.

1) A special cursor {{PurgeCursor}} is introduced to B+Tree to clean up tree items according
to a provided filter.
The filter is passed a page IO object and item index. An existing {{TreeRowClosure}} is re-used
for the filter.

For the purpose of the index partition clearing the filter passed to the above cursor does
the following:
- reads the link of the page item;
- extracts partition id from the link value;
- determines if the partition is requested for clearing.
This code is made a part of {{H2Tree}} class (which represents one segment of the index),
method {{purge}}.

The purge cursor may delete multiple items from the page at once. To reflect that, the new
physical delta WAL record {{PurgeRecord}} is introduced.
The new record contains the indexes of all removed items and the new count of items that remains
in the page.

2) A new {{GridQueryIndexing}} method to provide the caller with a collection of runnables
and futures each responsible for clearing a single index.
{{List<IgniteBiTuple<Runnable, IgniteInternalFuture<Void>>> purgeIndexPartitions(CacheGroupContext
grp, Set<Integer> parts)}}

This specific choice of API hides the details of enumerating tables and indexes of cache group
from the caller and allows to move actual processing to some other context.

The implementation is done in {{IgniteH2Indexing}} class.

3) Changes in pipeline of {{PartitionsEvictManager}}.

An API entry point for starting index partition eviction mechanism on a cache group and listening
for single result.
{{public IgniteInternalFuture<Void> purgePartitionsExclusively(CacheGroupContext grp,
List<GridDhtLocalPartition> parts) throws IgniteCheckedException}}

It makes sure that all the given partitions are not reserved and are not being cleared at
the moment.
The {{clearFuture}} futures of every partition is reinitialized to prevent further clashes
with regular clearing requests.
After all partitions are available for purge, the index purge runnables provided by {{GridQueryIndexing}}
are converted to tasks that are put through usual queues/pipeline of {{PartitionsEvictManager}}.

Once all index purge tasks are completed, the additional tasks are run:
- to clean up all the CacheMapEntries for affected partitions;
- to clean up H2RowCache.

5) Limitations/applicability.
- It is assumed that no concurrent changes are made to affected partitions, otherwise there
is no guarantee that index is free of rows of such partitions after iteration is completed.
In current design of file-based rebalancing, the concurrent changes are not applied to the
data storage.
- Only applicable to indexes that use Ignite's internal B+Tree. Cache groups with full text
indexes cannot be cleared that way.
- Per-entry event notifications are missing, since we are not clearing actual data store partition.

> Faster index eviction
> ---------------------
>                 Key: IGNITE-12295
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12295
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Sergey Kalashnikov
>            Assignee: Sergey Kalashnikov
>            Priority: Major
> For the file-based rebalancing approach, it seems feasible to avoid iterating the old
partition data in order to clear the indexes.
> One can independently clear the shared index structures of all the rows referencing entries
from moving partitions by deducing partition id from the links in the leaf pages.
> The proposed algorithm is simple and takes the set of integer partition ids as an input:
> 1. Iterate over leaf pages of the index and remove items attributed to any of indicated
partitions, unless it is the only or the rightmost item on a page.
> 2. If the rightmost item (or the only item) on a page happens to belong to any of the
indicated partitions, employ a regular remove algorithm (descending from the root) so that
inner pages get correctly updated.
> Restart iteration from the leaf page where the removed item would be inserted (descend
from the root to find it).
> The use of such algorithm can be justified (as having performance advantage) when the
number of keys that'd be removed is bigger than the number of leaf pages in the index.

This message was sent by Atlassian Jira

View raw message