ignite-issues mailing list archives

From "Amelchev Nikita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave
Date Mon, 24 Jun 2019 11:43:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871096#comment-16871096 ]

Amelchev Nikita commented on IGNITE-9913:

Hi, [~ivan.glukos].
I found two possible blockers to implementing such a lightweight PME without blocking updates:

1. Finalizing partition update counters. It seems that we can't correctly collect gaps and
process them without completing all txs. See {{GridDhtPartitionTopologyImpl#finalizeUpdateCounters}}.

2. Applying update counters. We can't correctly set the {{HWM}} counter if the primary left
the cluster after sending updates to only some of the backups. Such updates can be processed
later and break the guarantee that {{LWM<=HWM}}.
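
To illustrate both points, here is a toy sketch of a per-partition counter. It is not Ignite's
implementation: the class {{ToyUpdateCounter}} and its methods are hypothetical names, and it
assumes that LWM means "all updates up to this counter are applied", that gaps are missing
counters between LWM and the highest applied update, and that HWM is assigned once at exchange
time without waiting for in-flight updates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model of a partition update counter (hypothetical, not Ignite code). */
final class ToyUpdateCounter {
    /** Low watermark: every update with a counter <= lwm has been applied. */
    private long lwm;

    /** High watermark: in this toy model it is only assigned at exchange time. */
    private long hwm;

    /** Counters applied out of order, i.e. above lwm with at least one gap below them. */
    private final TreeSet<Long> outOfOrder = new TreeSet<>();

    /** Applies an update and advances lwm over the contiguous applied prefix. */
    void apply(long cntr) {
        if (cntr <= lwm || !outOfOrder.add(cntr))
            return; // Duplicate update, nothing to do.

        while (!outOfOrder.isEmpty() && outOfOrder.first() == lwm + 1)
            lwm = outOfOrder.pollFirst();
    }

    /** Models assigning HWM during a lightweight exchange (assumption of this sketch). */
    void setHwm(long val) {
        hwm = val;
    }

    /** Returns counters that are still missing between lwm and the highest applied update. */
    List<Long> gaps() {
        List<Long> res = new ArrayList<>();
        long prev = lwm;

        for (long c : outOfOrder) {
            for (long missing = prev + 1; missing < c; missing++)
                res.add(missing);

            prev = c;
        }

        return res;
    }

    long lwm() {
        return lwm;
    }

    long hwm() {
        return hwm;
    }

    boolean invariantHolds() {
        return lwm <= hwm;
    }

    public static void main(String[] args) {
        // Point 1: update 2 is still in flight, so a gap exists until it is applied.
        ToyUpdateCounter a = new ToyUpdateCounter();
        a.apply(1);
        a.apply(3);
        System.out.println("lwm=" + a.lwm() + ", gaps=" + a.gaps());

        // Point 2: HWM is fixed at exchange time, then a late update from the old primary arrives.
        ToyUpdateCounter b = new ToyUpdateCounter();
        b.apply(1);
        b.apply(2);
        b.setHwm(b.lwm());
        b.apply(3);
        System.out.println("lwm=" + b.lwm() + ", hwm=" + b.hwm() + ", ok=" + b.invariantHolds());
    }
}
```

In the second scenario the late update moves LWM past the HWM that was fixed earlier, which is
the kind of violation I mean in point 2.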

Could you take a look?

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>         Attachments: 9913_yardstick.png, master_yardstick.png
>          Time Spent: 50m
>  Remaining Estimate: 0h
> An Ignite cluster performs a distributed partition map exchange (PME) when any server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all partitions are assigned according to the baseline topology and a server node leaves, there is no actual need to perform a distributed PME: every cluster node is able to recalculate the new affinity assignments and partition states locally. If we implement such a lightweight PME and handle mapping and lock requests on the new topology version correctly, updates won't be stopped (except updates of partitions that lost their primary copy).

