ignite-user mailing list archives

From "aaron@tophold.com" <aa...@tophold.com>
Subject Re: Re: What's the best practice to init the cache when in cluster env?
Date Wed, 27 Sep 2017 04:10:21 GMT
Got it, thanks Denis! Much appreciated!


Regards
Aaron


aaron@tophold.com
 
From: Denis Magda
Date: 2017-09-27 06:03
To: user
Subject: Re: What's the best practice to init the cache when in cluster env?
Aaron,

It’s safe to preload the data while the cluster topology is changing. Both the data streamer
and the CacheStore approaches handle this:
https://apacheignite.readme.io/docs/data-loading

During rebalancing, a node evicts the data for which it is no longer a primary or backup
owner. You don’t need to worry about this; it’s Ignite’s job.

—
Denis M.

On Sep 26, 2017, at 1:46 AM, aaron@tophold.com wrote:

Thanks Denis, 

The real difficulty is that we are not sure when the entire cluster will be ready, as we may
add or remove nodes at run-time.

Another question: if I load the data once on the first started node, then after the other nodes
come up and rebalancing completes, will the primary nodes evict the entries that do not belong
to them?

As we run regular aggregations locally on each node, we do not want this to be too heavy
on the first node.


Regards
Aaron


aaron@tophold.com
 
From: Denis Mekhanikov
Date: 2017-09-25 19:46
To: user
Subject: Re: What's the best practice to init the cache when in cluster env?
Hi Aaron!

There are two good options for data loading: using DataStreamer or IgniteCache.loadCache(...).
The second option is good when initial data is stored in some database.

If you are worried about the overhead of data rebalancing, you can start the cluster and begin
streaming data only once all nodes are up. In this case records land at their final destination
right away, without needing to be moved to other nodes.
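
As a rough illustration of why the timing matters, here is a toy model in plain Java. It is
only a sketch under stated assumptions, not Ignite's actual affinity code: it assigns each key
directly to a node with a rendezvous-style (highest random weight) choice, whereas Ignite maps
keys to partitions first. The class name RebalanceSketch, the weight() mixing function and the
node ids are all hypothetical. It counts how many keys change their primary node when the
topology grows after the load, compared with loading once the cluster is complete.

```java
import java.util.List;

/** Toy model of key-to-node assignment; NOT Ignite's real affinity implementation. */
class RebalanceSketch {

    /** Hypothetical helper: mixes a key and a node id into a pseudo-random weight. */
    static long weight(int key, int nodeId) {
        long h = key * 0x9E3779B97F4A7C15L ^ nodeId * 0xC2B2AE3D27D4EB4FL;
        h ^= h >>> 32;
        h *= 0xD6E8FEB86659FD93L;
        h ^= h >>> 32;
        return h;
    }

    /** Rendezvous-style choice: the node with the highest weight for this key is primary. */
    static int primaryFor(int key, List<Integer> nodes) {
        int best = nodes.get(0);
        for (int n : nodes) {
            if (weight(key, n) > weight(key, best)) {
                best = n;
            }
        }
        return best;
    }

    /** Counts keys whose primary node differs between the two topologies. */
    static int movedKeys(int keyCount, List<Integer> before, List<Integer> after) {
        int moved = 0;
        for (int k = 0; k < keyCount; k++) {
            if (primaryFor(k, before) != primaryFor(k, after)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<Integer> firstNodeOnly = List.of(1);        // data loaded while only one node is up
        List<Integer> fullCluster = List.of(1, 2, 3, 4); // data loaded after all nodes joined

        System.out.println("keys moved if loaded early: "
                + movedKeys(10_000, firstNodeOnly, fullCluster));
        System.out.println("keys moved if loaded late:  "
                + movedKeys(10_000, fullCluster, fullCluster));
    }
}
```

In this model, loading on the first node means every key that later gets a different primary
has to be transferred, while loading into the complete topology moves nothing. The real cost in
Ignite also depends on backups and partition counts, which the sketch ignores.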

Denis

On Mon, Sep 25, 2017 at 14:31, aaron@tophold.com <aaron@tophold.com> wrote:
Hi all,

Suppose we have a dozen nodes caching millions of records from a DB.

At init, what's the best way to load that data? We use the data streamer to load the data,
and all our entries include a partition ID when they are inserted into the DB.

As the nodes are started one by one, loading from one node and then rebalancing seems
infeasible and wasteful.

Not sure whether there is any guideline or best practice/advice for such a scenario.

Thanks for your time!


Regards
Aaron


aaron@tophold.com
