ignite-issues mailing list archives

From "Alexey Goncharuk (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-9275) Introduce mechanism to fetch partition file via a p2p protocol
Date Wed, 15 Aug 2018 13:55:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Goncharuk updated IGNITE-9275:
-------------------------------------
    Description: 
As a first step toward estimating how much faster file-based rebalancing could be, I suggest
implementing a simple partition fetch procedure via a communication SPI extension:
1) Node A sends a partition fetch request to node B
2) Node B starts a checkpoint and creates a local copy of the partition. Other checkpoints may
run concurrently while the partition is being copied; this must be handled properly
3) Node B establishes a new TCP connection on the TCP communication port (handshake and verification
are assumed)
4) Node B calls transferFile (or a native analogue, investigation needed) to send the partition
file in the most efficient way
5) Node A writes the file to a specified location on the local file system
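
Steps 3 to 5 above could be sketched roughly as follows. This is a minimal illustration, not Ignite
code: it assumes that "transferFile" refers to something like Java NIO's FileChannel.transferTo,
which can delegate to a zero-copy system call on some platforms. The class and method names
(PartitionFileTransferSketch, send, receive, roundTrip) are hypothetical, the handshake and
verification are omitted, and the file size is assumed to be known to the receiver in advance
(in a real protocol it would presumably be exchanged during the handshake).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of a file transfer over a dedicated TCP connection. */
final class PartitionFileTransferSketch {
    /** Sender side (node B): streams the whole file into the socket channel. */
    static long send(Path src, SocketChannel out) throws IOException {
        try (FileChannel file = FileChannel.open(src, StandardOpenOption.READ)) {
            long size = file.size();
            long pos = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (pos < size)
                pos += file.transferTo(pos, size - pos, out);
            return pos;
        }
    }

    /** Receiver side (node A): writes exactly {@code size} bytes to the target file. */
    static long receive(SocketChannel in, Path dst, long size) throws IOException {
        try (FileChannel file = FileChannel.open(dst, StandardOpenOption.CREATE,
            StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long pos = 0;
            while (pos < size) {
                long n = file.transferFrom(in, pos, size - pos);
                // On a blocking channel, zero bytes means the peer closed the connection.
                if (n <= 0)
                    throw new IOException("Connection closed after " + pos + " of " + size + " bytes");
                pos += n;
            }
            return pos;
        }
    }

    /** Loopback demo: the sender connects to the receiver, as node B does in step 3. */
    static long roundTrip(Path src, Path dst) throws Exception {
        long size = Files.size(src);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port for the demo.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress addr = (InetSocketAddress) server.getLocalAddress();
            Thread sender = new Thread(() -> {
                try (SocketChannel ch = SocketChannel.open(addr)) {
                    send(src, ch);
                }
                catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();
            try (SocketChannel ch = server.accept()) {
                long received = receive(ch, dst, size);
                sender.join();
                return received;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempFile("part-src", ".bin");
        Path dst = Files.createTempFile("part-dst", ".bin");
        Files.write(src, new byte[4 * 1024 * 1024]);
        System.out.println("transferred bytes: " + roundTrip(src, dst));
    }
}
```

The sketch does not cover step 2: producing a consistent copy of the partition while checkpoints
are running is the harder part and is left open here, as in the description.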

Once this mechanism is implemented, we need to hack the rebalance code to use the partition
fetch logic instead of regular rebalance, and then measure:
1) How much faster (or slower) the new approach performs
2) How it affects the concurrent transactions in the grid

> Introduce mechanism to fetch partition file via a p2p protocol
> --------------------------------------------------------------
>
>                 Key: IGNITE-9275
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9275
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Alexey Goncharuk
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
