Date: Thu, 5 Sep 2019 13:30:00 +0000 (UTC)
From: "Pavel Pereslegin (Jira)"
To: issues@ignite.apache.org
Reply-To: dev@ignite.apache.org
Subject: [jira] [Updated] (IGNITE-12069) Implement file rebalancing management

     [ https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-12069:
--------------------------------------
    Description:

{{Preloader}} should be able to do the following:
# build the map of partitions and the corresponding supplier nodes from which the partitions will be loaded;
# switch the cache data storage to {{no-op}} and back to the original one under a checkpoint (the HWM must be fixed here for the needs of historical rebalance), keeping the partition update counter for each partition;
# asynchronously evict indexes for the list of collected partitions;
# send a request message to each node, one by one, with the list of partitions to load;
# wait for the files to be received (listening on the transmission handler);
# asynchronously rebuild indexes over the received partitions;
# run historical rebalance from the LWM to the HWM collected above (the LWM can be read from the meta page of the received file).
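A minimal, illustrative sketch of this flow in plain Java is shown below. Every type and method in it ({{AssignmentResolver}}, {{PartitionStore}}, {{ReceivedFile}}, {{IndexManager}}, etc.) is a hypothetical placeholder defined only for the example, not an existing Ignite API; the sketch just mirrors the ordering of the steps above.

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/**
 * Outline of the preloading steps above. All nested types are hypothetical
 * placeholders introduced for illustration only.
 */
public class FilePreloaderSketch {
    /** Step 1: supplier node id -> partitions to be loaded from it as files. */
    interface AssignmentResolver {
        Map<String, List<Integer>> partitionsPerSupplier();
    }

    /** Per-partition store that can be switched to no-op mode and back. */
    interface PartitionStore {
        void switchToNoOp();                      // done under a checkpoint, the HWM is fixed here
        void switchToOriginal(long initCounter);  // re-initialized on the received partition file
        long updateCounter();
    }

    /** Step 4: demand request to a single supplier node. */
    interface Supplier {
        CompletableFuture<Map<Integer, ReceivedFile>> demand(List<Integer> parts);
    }

    interface ReceivedFile {
        long lwm(); // low watermark, read from the meta page of the received file
    }

    interface IndexManager {
        CompletableFuture<Void> evictAsync(List<Integer> parts);
        CompletableFuture<Void> rebuildAsync(List<Integer> parts);
    }

    interface HistoricalRebalancer {
        void rebalance(int part, long fromLwm, long toHwm);
    }

    void preload(AssignmentResolver assignments,
                 Map<Integer, PartitionStore> stores,
                 Map<String, Supplier> suppliers,
                 IndexManager indexes,
                 HistoricalRebalancer historical) {
        // 1. Build the partition -> supplier map.
        Map<String, List<Integer>> plan = assignments.partitionsPerSupplier();

        for (Map.Entry<String, List<Integer>> e : plan.entrySet()) {
            List<Integer> parts = e.getValue();

            // 2. Switch the stores to no-op under a checkpoint and remember the HWM.
            Map<Integer, Long> hwm = new HashMap<>();
            for (int p : parts) {
                stores.get(p).switchToNoOp();
                hwm.put(p, stores.get(p).updateCounter());
            }

            // 3. Evict indexes asynchronously for the collected partitions.
            CompletableFuture<Void> eviction = indexes.evictAsync(parts);

            // 4-5. Send the demand message to this supplier and wait for the partition files.
            Map<Integer, ReceivedFile> files = suppliers.get(e.getKey()).demand(parts).join();
            eviction.join();

            // Switch back to the original store over the received files.
            files.forEach((p, f) -> stores.get(p).switchToOriginal(f.lwm()));

            // 6. Rebuild indexes asynchronously over the received partitions.
            CompletableFuture<Void> rebuild = indexes.rebuildAsync(parts);

            // 7. Historical rebalance from the LWM of the file to the HWM fixed at the switch.
            files.forEach((p, f) -> historical.rebalance(p, f.lwm(), hwm.get(p)));

            rebuild.join();
        }
    }
}
{code}

Fixing the HWM at the moment the store is switched to {{no-op}} is what lets step 7 close the gap between the state captured in the received file (its LWM) and the updates that arrived while the file was in flight (the HWM).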
h5. Stage 1. Implement "read-only" mode for the cache data store. Implement data store reinitialization on the updated persistence file.
h6. Tests:
- Switching under load.
- Check re-initialization of a partition on a new file.
- Check that in read-only mode:
** indexes are not updated;
** the update counter is updated;
** cache entries eviction works fine;
** tx/atomic updates on this partition work fine in the cluster.

h5. Stage 2. Build the map of requested partitions per node and add the message that will be sent to the supplier. Send a demand request, handle the response, and switch the data store when the file is received.
h6. Tests:
- Check partition consistency after receiving a file.
- File transmission under load.
- Failover - some of the partitions have been switched and the node has been restarted; rebalancing is expected to continue through the historical rebalance only for fully loaded large partitions, while for the rest of the partitions it should restart from the beginning.

h5. Stage 3. Add WAL history reservation on the supplier. Add historical rebalance triggering (LWM (partition) - HWM (read-only)).
h6. Tests:
- File rebalancing with and without load on atomic/tx caches (check existing PDS-enabled rebalancing tests).
- Ensure that MVCC groups use regular rebalancing.
- Rebalancing on an unstable topology with failures of the supplier/demander nodes at different stages.
- (compatibility) Old nodes should use regular rebalancing.

h5. Stage 4. Eviction and rebuild of indexes.
h6. Tests:
- File rebalancing of caches with H2 indexes.
- Check consistency of H2 indexes.
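For Stage 1, a minimal sketch of what the "read-only" switch implies for a partition's data store is given below, assuming a simplified placeholder interface ({{CacheDataStoreView}} and {{NoOpStore}} are illustrative names, not Ignite classes): data and index updates are dropped, but the partition update counter keeps advancing, which matches the read-only-mode expectations listed in the Stage 1 tests.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch of the "read-only" (no-op) data store mode from Stage 1.
 * CacheDataStoreView is a hypothetical placeholder, not an Ignite interface:
 * the point is only that writes are dropped while the update counter advances.
 */
public class NoOpDataStoreSketch {
    /** Minimal view of the operations that matter for the switch. */
    interface CacheDataStoreView {
        void update(Object key, Object val, long updateCounter);
        long updateCounter();
    }

    /** Delegating wrapper: ignores data updates, keeps counting them. */
    static class NoOpStore implements CacheDataStoreView {
        private final AtomicLong cntr;

        NoOpStore(long startCounter) {
            this.cntr = new AtomicLong(startCounter);
        }

        @Override public void update(Object key, Object val, long updateCounter) {
            // Data and indexes are intentionally not touched while the partition
            // file is being replaced; only the counter moves forward so that the
            // HWM for the subsequent historical rebalance stays correct.
            cntr.accumulateAndGet(updateCounter, Math::max);
        }

        @Override public long updateCounter() {
            return cntr.get();
        }
    }
}
{code}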
> Implement file rebalancing management
> -------------------------------------
>
>                 Key: IGNITE-12069
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12069
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Maxim Muzafarov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-28

--
This message was sent by Atlassian Jira
(v8.3.2#803003)