ignite-user mailing list archives

From: Denis Magda <dma...@apache.org>
Subject: Re: CacheStore's Performance Drops Dramatically - Why?
Date: Thu, 04 May 2017 19:43:54 GMT
Looks like the naming of the ‘getWriteBehindFlushSize’ method is totally misleading. It confuses so many people. However, if we refer to the documentation of this method or look into the source code, we will find out that it sets the maximum size of the write-behind queue/buffer on a single node. Once this size is reached, data is flushed to the storage in synchronous mode.

So, if the store can’t keep up with your updates and keeps switching to the synchronous mode, you need to set the flush size (the maximum queue/buffer size) to a bigger value.
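
In code, raising the buffer looks like this (a minimal sketch; the cache name, value type and number are illustrative, not a recommendation):

    import org.apache.ignite.configuration.CacheConfiguration;

    CacheConfiguration<Long, Person> cfg = new CacheConfiguration<>("myCache");

    // Maximum size of the write-behind queue/buffer on a single node.
    // Once it is reached, updates are written to the store synchronously.
    cfg.setWriteBehindFlushSize(40_960); // default is 10240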

In any case, I’ve created a ticket to address both issues discussed here:
https://issues.apache.org/jira/browse/IGNITE-5173

Thanks for your patience.

—
Denis

> On May 3, 2017, at 10:10 AM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:
> 
> I thought flushSize could be set several times higher than the batch size so that, in a cluster, data nodes would flush in parallel. For example, in a cluster with 10 nodes, flushSize = 10240, thread count = 2 and batch size = 512, each node would flush in 2 threads, and each thread would flush in batches of 512.
> 
> Could someone confirm or clarify this understanding? Thank you!
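> 
> Spelled out as arithmetic (an illustrative reading of the numbers above, assuming each flush thread drains full batches), that would mean:
> 
>     // 10 nodes * 2 flush threads * 512 entries per writeAll() batch
>     // = up to 10,240 entries written out per flush round across the cluster
>     int entriesPerRound = 10 * 2 * 512; // = 10240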
> 
> On Wed, May 3, 2017 at 12:16 AM, Matt <dromitlabs@gmail.com> wrote:
> In fact, I don't see why you would need both batchSize and flushSize. If I got it right, only the minimum of the two is used by Ignite to decide when to flush, so why do we have both in the first place?
> 
> In case they're both necessary for a reason I'm not seeing, I still wonder whether the default values should instead satisfy batchSize > flushSize, as I suspect they should.
> 
> On Wed, May 3, 2017 at 3:26 AM, Matt <dromitlabs@gmail.com> wrote:
> I'm writing to confirm that I managed to fix my problem by fine-tuning the config params for the write-behind cache until the performance was fine. I still see single-element inserts from time to time, but only a few every now and then, not like before. You should definitely avoid synchronous single-element insertions; I hope that changes in future versions.
> 
> Regarding writeBehindBatchSize and writeBehindFlushSize, I don't see the point of setting both values when batchSize < flushSize (the default values are 512 and 10240, respectively). If I'm not wrong, the cache is flushed whenever its size equals min(batchSize, flushSize). Since batchSize is less than flushSize, flushSize is never really used, and the size of the flush is controlled only by the size of the cache itself.
> 
> That is how it works by default. On the other hand, if we swapped their values (i.e., batchSize = 10240 and flushSize = 512), the behavior would be the same (Ignite would call writeAll() with 512 elements each time), but the number of elements flushed would be controlled by the correct variable (i.e., flushSize).
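> 
> In pseudo-Java, my reading of the trigger amounts to this (a sketch of how I understand it, not the actual Ignite source; the names are made up):
> 
>     // Flush as soon as the buffered entries reach the smaller of the two:
>     int threshold = Math.min(batchSize, flushSize); // 512 with the defaults
> 
>     if (buffer.size() >= threshold)
>         store.writeAll(drainUpTo(buffer, batchSize)); // drainUpTo is hypothetical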
> 
> Were the default values supposed to be the other way around, or am I missing something?
> 
> On Tue, May 2, 2017 at 9:13 PM, Denis Magda <dmagda@apache.org> wrote:
> Matt,
> 
> Cross-posting to the dev list.
> 
> Yes, Ignite switches to the synchronous mode once the buffer is exhausted. However, I do agree that the right solution would be to flush multiple entries rather than one in the synchronous mode. *Igniters*, I was sure we had a ticket for that optimization but I'm unable to find it. Does anybody know the ticket name/number?
> 
> To avoid the performance degradation, you have to tweak the following parameters so that the write-behind store can keep up with your updates:
> - setWriteBehindFlushThreadCount
> - setWriteBehindFlushFrequency
> - setWriteBehindBatchSize
> - setWriteBehindFlushSize
> 
> Usually tuning these has helped Apache Ignite users in all such cases.
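> 
> As a starting point, a write-behind configuration might look like this (a sketch; the numbers are illustrative and should be tuned to your ingest rate, and the cache name and types are made up):
> 
>     import org.apache.ignite.configuration.CacheConfiguration;
> 
>     CacheConfiguration<Long, Person> cfg = new CacheConfiguration<>("myCache");
> 
>     cfg.setWriteThrough(true);
>     cfg.setWriteBehindEnabled(true);
>     cfg.setWriteBehindFlushThreadCount(4);   // default is 1
>     cfg.setWriteBehindFlushFrequency(2_000); // in ms, default is 5000
>     cfg.setWriteBehindBatchSize(1_024);      // default is 512
>     cfg.setWriteBehindFlushSize(81_920);     // default is 10240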
> 
> > QUESTION 2
> >
> > I've read in the docs that using ATOMIC mode (the default mode) is better for performance, but I'm not getting why. If I'm not wrong, using TRANSACTIONAL mode would cause the CacheStore to reuse connections (not call openConnection(autocommit=true) on each writeAll()).
> >
> > Shouldn't it be better to use transactional mode?
> 
> Transactional mode enables the 2-phase commit protocol: https://apacheignite.readme.io/docs/transactions#two-phase-commit-2pc
> 
> This is why atomic operations are swifter in general.
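> 
> If you want to choose the mode explicitly (a sketch; cfg is the cache configuration from the snippet above):
> 
>     import org.apache.ignite.cache.CacheAtomicityMode;
> 
>     // ATOMIC (the default) skips the two-phase-commit coordination:
>     cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
> 
>     // TRANSACTIONAL buys ACID transactions at the cost of the extra 2PC messaging:
>     // cfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);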
> 
> —
> Denis
> 
> > On May 2, 2017, at 10:40 AM, Matt <dromitlabs@gmail.com> wrote:
> >
> > No, only with inserts. I haven't tried removing at this rate yet, but it may have the same problem.
> >
> > I'm debugging Ignite's internal code and I may be onto something. The thing is, Ignite has a cacheMaxSize (aka WriteBehindFlushSize) and a cacheCriticalSize (which by default is cacheMaxSize * 1.5). When the cache reaches that size, Ignite starts writing elements SYNCHRONOUSLY, as you can see in [1].
> >
> > I think this makes things worse: since only one single value is flushed at a time, it becomes much slower, forcing Ignite to do even more synchronous writes.
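> >
> > In rough pseudo-Java, my reading of that code path is (simplified from [1]; the names are approximate, not the exact source):
> >
> >     int criticalSize = (int)(cacheMaxSize * 1.5);
> >
> >     if (writeCache.size() > criticalSize) {
> >         // The caller's thread flushes a single entry synchronously,
> >         // one writeAll() call per value.
> >         flushSingleValue();
> >     }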
> >
> > Anyway, I'm still not sure why the cache reaches that level when the database is clearly able to keep up with the insertions. I'll check whether it has to do with the number of open connections or something else.
> >
> > Any insight on this is very welcome!
> >
> > [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/store/GridCacheWriteBehindStore.java#L620
> >
> > On Tue, May 2, 2017 at 2:17 PM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:
> > I noticed that behavior when any cache.remove operation is involved. Just putting stuff in the cache seems to work properly.
> >
> > Do you use remove operation?
> >
> > On Tue, May 2, 2017 at 9:57 AM, Matt <dromitlabs@gmail.com> wrote:
> > I'm stuck with that. No matter what config I use (flush size, write threads, etc.), this is the behavior I always get. It's as if Ignite's internal buffer is full and it's trying to write and get rid of the oldest (one) element only.
> >
> > Any ideas, people? What is your CacheStore configuration to avoid this?
> >
> > On Tue, May 2, 2017 at 11:50 AM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:
> > Hello Matt, thank you for posting. I've noticed similar behavior.
> >
> > Would be curious to see the response from the engineering team.
> >
> > Best,
> > Jessie
> >
> > On Tue, May 2, 2017 at 1:03 AM, Matt <dromitlabs@gmail.com> wrote:
> > Hi all,
> >
> > I have two questions for you!
> >
> > QUESTION 1
> >
> > I'm following the example in [1] (a mix between "jdbc transactional" and "jdbc bulk operations") and I've enabled write-behind; however, after the first 10k-20k insertions the performance drops *dramatically*.
> >
> > Based on prints I've added to the CacheStore, I've noticed what Ignite is doing is this:
> >
> > - writeAll called with 512 elements (Ignite buffers elements, that's good)
> > - openConnection with autocommit=true is called each time inside writeAll (since the session is not stored in atomic mode)
> > - writeAll is called with 512 elements a few dozen times, each time opening a new JDBC connection as mentioned above
> > - ...
> > - writeAll called with ONE element (for some reason Ignite stops buffering elements)
> > - writeAll is called with ONE element from here on, each time opening a new JDBC connection as mentioned above
> > - ...
> >
> > Things to note:
> >
> > - All config values are the default ones except for write-through and write-behind, which are both enabled.
> > - I'm running this as a server node (the only node in the cluster, the application itself).
> > - I see the problem even with a big heap (i.e., Ignite is not nearly out of memory).
> > - I'm using PostgreSQL for this test (it's fine ingesting around 40k rows per second on this computer, so that shouldn't be a problem).
> >
> > What is causing Ignite to stop buffering elements after calling writeAll() a few dozen times?
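> >
> > For reference, my writeAll follows the docs example [1] and looks roughly like this (a simplified sketch; the table, columns and key/value types are made up, and openConnection is my store's own helper):
> >
> >     import java.sql.Connection;
> >     import java.sql.PreparedStatement;
> >     import java.sql.SQLException;
> >     import java.util.Collection;
> >     import javax.cache.Cache;
> >     import javax.cache.integration.CacheWriterException;
> >
> >     @Override public void writeAll(Collection<Cache.Entry<? extends Long, ? extends Person>> entries) {
> >         // A new connection per call, since no session is stored in atomic mode.
> >         try (Connection conn = openConnection(true); // autocommit=true
> >              PreparedStatement st = conn.prepareStatement(
> >                  "INSERT INTO person (id, name) VALUES (?, ?)")) {
> >             for (Cache.Entry<? extends Long, ? extends Person> e : entries) {
> >                 st.setLong(1, e.getKey());
> >                 st.setString(2, e.getValue().getName());
> >                 st.addBatch();
> >             }
> >
> >             st.executeBatch();
> >         }
> >         catch (SQLException ex) {
> >             throw new CacheWriterException(ex);
> >         }
> >     }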
> >
> > QUESTION 2
> >
> > I've read in the docs that using ATOMIC mode (the default mode) is better for performance, but I'm not getting why. If I'm not wrong, using TRANSACTIONAL mode would cause the CacheStore to reuse connections (not call openConnection(autocommit=true) on each writeAll()).
> >
> > Shouldn't it be better to use transactional mode?
> >
> > Regards,
> > Matt
> >
> > [1] https://apacheignite.readme.io/docs/persistent-store#section-cachestore-example
