Date: Thu, 5 Sep 2019 13:30:00 +0000 (UTC)
From: "Pavel Pereslegin (Jira)"
To: issues@ignite.apache.org
Reply-To: dev@ignite.apache.org
Subject: [jira] [Updated] (IGNITE-12069) Implement file rebalancing management

     [ https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-12069:
--------------------------------------
    Description:

{{Preloader}} should be able to do the following:
# build the map of partitions and the corresponding supplier nodes from which the partitions will be loaded;
# switch the cache data storage to {{no-op}} and back to the original one under a checkpoint (the HWM must be fixed here for the needs of historical rebalance), keeping the partition update counter for each partition;
# asynchronously evict indexes for the list of collected partitions;
# send a request message to each node, one by one, with the list of partitions to load;
# wait for the files to be received (listening on the transmission handler);
# asynchronously rebuild indexes over the received partitions;
# run historical rebalance from the LWM to the HWM collected above (the LWM can be read from the meta page of the received file).
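A minimal, illustrative sketch of this flow in plain Java is shown below. Every type and method in it ({{AssignmentResolver}}, {{PartitionStore}}, {{ReceivedFile}}, {{IndexManager}}, etc.) is a hypothetical placeholder defined only for the example, not an existing Ignite API; the sketch just mirrors the ordering of the steps above.

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/**
 * Outline of the preloading steps above. All nested types are hypothetical
 * placeholders introduced for illustration only.
 */
public class FilePreloaderSketch {
    /** Step 1: supplier node id -> partitions to be loaded from it as files. */
    interface AssignmentResolver {
        Map<String, List<Integer>> partitionsPerSupplier();
    }

    /** Per-partition store that can be switched to no-op mode and back. */
    interface PartitionStore {
        void switchToNoOp();                      // done under a checkpoint, the HWM is fixed here
        void switchToOriginal(long initCounter);  // re-initialized on the received partition file
        long updateCounter();
    }

    /** Step 4: demand request to a single supplier node. */
    interface Supplier {
        CompletableFuture<Map<Integer, ReceivedFile>> demand(List<Integer> parts);
    }

    interface ReceivedFile {
        long lwm(); // low watermark, read from the meta page of the received file
    }

    interface IndexManager {
        CompletableFuture<Void> evictAsync(List<Integer> parts);
        CompletableFuture<Void> rebuildAsync(List<Integer> parts);
    }

    interface HistoricalRebalancer {
        void rebalance(int part, long fromLwm, long toHwm);
    }

    void preload(AssignmentResolver assignments,
                 Map<Integer, PartitionStore> stores,
                 Map<String, Supplier> suppliers,
                 IndexManager indexes,
                 HistoricalRebalancer historical) {
        // 1. Build the partition -> supplier map.
        Map<String, List<Integer>> plan = assignments.partitionsPerSupplier();

        for (Map.Entry<String, List<Integer>> e : plan.entrySet()) {
            List<Integer> parts = e.getValue();

            // 2. Switch the stores to no-op under a checkpoint and remember the HWM.
            Map<Integer, Long> hwm = new HashMap<>();
            for (int p : parts) {
                stores.get(p).switchToNoOp();
                hwm.put(p, stores.get(p).updateCounter());
            }

            // 3. Evict indexes asynchronously for the collected partitions.
            CompletableFuture<Void> eviction = indexes.evictAsync(parts);

            // 4-5. Send the demand message to this supplier and wait for the partition files.
            Map<Integer, ReceivedFile> files = suppliers.get(e.getKey()).demand(parts).join();
            eviction.join();

            // Switch back to the original store over the received files.
            files.forEach((p, f) -> stores.get(p).switchToOriginal(f.lwm()));

            // 6. Rebuild indexes asynchronously over the received partitions.
            CompletableFuture<Void> rebuild = indexes.rebuildAsync(parts);

            // 7. Historical rebalance from the LWM of the file to the HWM fixed at the switch.
            files.forEach((p, f) -> historical.rebalance(p, f.lwm(), hwm.get(p)));

            rebuild.join();
        }
    }
}
{code}

Fixing the HWM at the moment the store is switched to {{no-op}} is what lets step 7 close the gap between the state captured in the received file (its LWM) and the updates that arrived while the file was in flight (the HWM).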
h5. Stage 1. Implement "read-only" mode for the cache data store. Implement data store reinitialization on the updated persistence file.
h6. Tests:
- Switching under load.
- Check re-initialization of a partition on a new file.
- Check that in read-only mode:
** indexes are not updated;
** the update counter is updated;
** cache entries eviction works fine;
** tx/atomic updates on this partition work fine in the cluster.

h5. Stage 2. Build the map of requested partitions per node and add the message that will be sent to the supplier. Send a demand request, handle the response, and switch the data store when the file is received.
h6. Tests:
- Check partition consistency after receiving a file.
- File transmission under load.
- Failover - some of the partitions have been switched and the node has been restarted; rebalancing is expected to continue through the historical rebalance only for fully loaded large partitions, while for the rest of the partitions it should restart from the beginning.

h5. Stage 3. Add WAL history reservation on the supplier. Add historical rebalance triggering (LWM (partition) - HWM (read-only)).
h6. Tests:
- File rebalancing with and without load on atomic/tx caches (check existing PDS-enabled rebalancing tests).
- Ensure that MVCC groups use regular rebalancing.
- Rebalancing on an unstable topology with failures of the supplier/demander nodes at different stages.
- (compatibility) Old nodes should use regular rebalancing.

h5. Stage 4. Eviction and rebuild of indexes.
h6. Tests:
- File rebalancing of caches with H2 indexes.
- Check consistency of H2 indexes.
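For Stage 1, a minimal sketch of what the "read-only" switch implies for a partition's data store is given below, assuming a simplified placeholder interface ({{CacheDataStoreView}} and {{NoOpStore}} are illustrative names, not Ignite classes): data and index updates are dropped, but the partition update counter keeps advancing, which matches the read-only-mode expectations listed in the Stage 1 tests.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch of the "read-only" (no-op) data store mode from Stage 1.
 * CacheDataStoreView is a hypothetical placeholder, not an Ignite interface:
 * the point is only that writes are dropped while the update counter advances.
 */
public class NoOpDataStoreSketch {
    /** Minimal view of the operations that matter for the switch. */
    interface CacheDataStoreView {
        void update(Object key, Object val, long updateCounter);
        long updateCounter();
    }

    /** Delegating wrapper: ignores data updates, keeps counting them. */
    static class NoOpStore implements CacheDataStoreView {
        private final AtomicLong cntr;

        NoOpStore(long startCounter) {
            this.cntr = new AtomicLong(startCounter);
        }

        @Override public void update(Object key, Object val, long updateCounter) {
            // Data and indexes are intentionally not touched while the partition
            // file is being replaced; only the counter moves forward so that the
            // HWM for the subsequent historical rebalance stays correct.
            cntr.accumulateAndGet(updateCounter, Math::max);
        }

        @Override public long updateCounter() {
            return cntr.get();
        }
    }
}
{code}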
> Implement file rebalancing management
> -------------------------------------
>
>                 Key: IGNITE-12069
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12069
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Maxim Muzafarov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-28

--
This message was sent by Atlassian Jira
(v8.3.2#803003)