From: Denis Magda <dmagda@apache.org>
Subject: Re: CacheStore's Performance Drops Dramatically - Why?
Date: Thu, 4 May 2017 12:43:54 -0700
To: user@ignite.apache.org, dev@ignite.apache.org

Looks like the naming of the 'getWriteBehindFlushSize' method is totally wrong; it confuses so many people. However, if we refer to the documentation of this method or look into the source code, we will find that it sets the maximum size of the write-behind queue/buffer on a single node. Once this size is reached, data is flushed to the storage in sync mode.

So, you need to set the flush size (the maximum queue/buffer size) to a bigger value if you can't keep up with updates and keep switching to the sync mode.

In any case, I've created a ticket to address both issues discussed here: https://issues.apache.org/jira/browse/IGNITE-5173

Thanks for your patience.

--
Denis

On May 3, 2017, at 10:10 AM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:

I thought the reason flushSize can be set several times higher than the batch size is that, in a cluster, data nodes would flush in parallel. For example, in a cluster with 10 nodes, flushSize 10240, thread count = 2, and batch size = 512, each node would flush in 2 threads, and each thread would flush in batches of 512.

Could someone confirm or clarify this understanding? Thank you!
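Jessie's reading can be put into numbers with a back-of-envelope sketch (the class and method names below are mine, purely illustrative; the model assumes each node drains its own buffer independently in writeAll() batches, spread across its flush threads):

```java
// Toy arithmetic model of parallel write-behind flushing across a cluster.
// Assumption (not from Ignite source): a full node buffer of flushSize
// entries is drained in writeAll() batches of batchSize, split over
// flushThreadCount threads, and every node flushes independently.
public class WriteBehindMath {
    // How many writeAll() calls it takes to drain one full node buffer.
    static int batchesToDrain(int flushSize, int batchSize) {
        return (flushSize + batchSize - 1) / batchSize; // ceiling division
    }

    public static void main(String[] args) {
        int nodes = 10, flushSize = 10_240, threads = 2, batchSize = 512;
        int batches = batchesToDrain(flushSize, batchSize);
        System.out.println("writeAll() calls per full node buffer: " + batches);          // 20
        System.out.println("batches handled by each flush thread: " + batches / threads); // 10
        System.out.println("flush threads working cluster-wide: " + nodes * threads);     // 20
    }
}
```

Under those assumptions the two parameters are complementary rather than redundant: flushSize bounds how much can pile up per node, batchSize bounds how much lands in a single writeAll() call.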

On Wed, May 3, 2017 at 12:16 AM, Matt <dromitlabs@gmail.com> wrote:
In fact, I don't see why you would need both batchSize and flushSize. If I got it right, only the min of them is used by Ignite to know when to flush, so why do we have both in the first place?

In case they're both necessary for a reason I'm not seeing, I still wonder whether the default values should be batchSize > flushSize, as I suspect, or not.

On Wed, May 3, 2017 at 3:26 AM, Matt <dromitlabs@gmail.com> wrote:
I'm writing to confirm I managed to fix my problem by fine-tuning the config params for the write-behind cache until the performance was acceptable. I still see single-element inserts from time to time, but just a few every now and then, not like before. You should definitely avoid synchronous single-element insertions; I hope that changes in future versions.

Regarding writeBehindBatchSize and writeBehindFlushSize, I don't see the point of setting both values when batchSize < flushSize (the default values are 512 and 10240, respectively). If I'm not wrong, the cache is flushed whenever its size is equal to min(batchSize, flushSize). Since batchSize is less than flushSize, flushSize is never really used, and the size of the flush is controlled only by the size of the cache itself.

That is how it works by default. On the other hand, if we swap their values (i.e., batchSize=10240 and flushSize=512), the behavior would be the same (Ignite would call writeAll() with 512 elements each time), but the number of elements flushed would be controlled by the correct variable (i.e., flushSize).

Were the default values supposed to be the other way around, or am I missing something?

On Tue, May 2, 2017 at 9:13 PM, Denis Magda <dmagda@apache.org> wrote:
Matt,

Cross-posting to the dev list.

Yes, Ignite switches to the synchronous mode once the buffer is exhausted. However, I do agree that the right solution would be to flush multiple entries rather than one in the synchronous mode. *Igniters*, I was sure we had a ticket for that optimization but am unable to find it. Does anybody know the ticket name/number?

To avoid the performance degradation, you have to tweak the following parameters so that the write-behind store can keep up with your updates:
- setWriteBehindFlushThreadCount
- setWriteBehindFlushFrequency
- setWriteBehindBatchSize
- setWriteBehindFlushSize
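For reference, those knobs map onto cache configuration properties. A minimal Spring XML sketch, with the property names as I recall them from CacheConfiguration and the values as placeholders you would tune for your own workload:

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="myCache"/>
    <property name="writeThrough" value="true"/>
    <property name="writeBehindEnabled" value="true"/>
    <!-- Drain the buffer with more threads if the store falls behind. -->
    <property name="writeBehindFlushThreadCount" value="4"/>
    <!-- Flush at least this often (in milliseconds), even if the buffer is not full. -->
    <property name="writeBehindFlushFrequency" value="5000"/>
    <!-- Max entries handed to a single writeAll() call. -->
    <property name="writeBehindBatchSize" value="512"/>
    <!-- Max buffered entries per node before flushing becomes mandatory. -->
    <property name="writeBehindFlushSize" value="10240"/>
    <!-- Plus a cacheStoreFactory pointing at your CacheStore implementation. -->
</bean>
```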

Tuning these has helped Apache Ignite users every time so far.

> QUESTION 2
>
> I've read in the docs that using ATOMIC mode (the default mode) is better for performance, but I'm not getting why. If I'm not wrong, using TRANSACTIONAL mode would cause the CacheStore to reuse connections (not call openConnection(autocommit=true) on each writeAll()).
>
> Shouldn't it be better to use transactional mode?

Transactional mode enables the two-phase commit (2PC) protocol: https://apacheignite.readme.io/docs/transactions#two-phase-commit-2pc

This is why atomic operations are swifter in general.

--
Denis

> On May 2, 2017, at 10:40 AM, Matt <dromitlabs@gmail.com> wrote:
>
> No, only with inserts. I haven't tried removing at this rate yet, but it may have the same problem.
>
> I'm debugging Ignite internal code and I may be onto something. The thing is, Ignite has a cacheMaxSize (aka WriteBehindFlushSize) and a cacheCriticalSize (which by default is cacheMaxSize*1.5). When the cache reaches that size, Ignite starts writing elements SYNCHRONOUSLY, as you can see in [1].
>
> I think this makes things worse: since only one single value is flushed at a time, writes become much slower, forcing Ignite to do even more synchronous writes.
>
> Anyway, I'm still not sure why the cache reaches that level when the database is clearly able to keep up with the insertions. I'll check whether it has to do with the number of open connections or something else.
>
> Any insight on this is very welcome!
>
> [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/store/GridCacheWriteBehindStore.java#L620
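The degradation Matt describes can be captured in a toy, single-threaded model (my own sketch, not Ignite's actual code, and the class name is made up): entries are buffered until criticalSize = 1.5 * flushSize is reached, after which every further put degrades into a synchronous single-entry write.

```java
// Toy model of the write-behind degradation described above. Assumption
// (from the thread, not verified against Ignite source): once the buffer
// passes criticalSize = 1.5 * flushSize, each put() is written through
// synchronously, one element at a time.
public class CriticalSizeModel {
    final int criticalSize;
    int buffered = 0;
    int syncWrites = 0;

    CriticalSizeModel(int flushSize) {
        this.criticalSize = (int) (flushSize * 1.5); // default factor per the thread
    }

    /** Returns true if this put had to fall back to a synchronous write. */
    boolean put() {
        if (buffered >= criticalSize) {
            syncWrites++;   // flusher can't keep up: sync write of ONE entry
            return true;
        }
        buffered++;         // normal case: just buffer the entry
        return false;
    }

    public static void main(String[] args) {
        CriticalSizeModel m = new CriticalSizeModel(1024); // criticalSize = 1536
        for (int i = 0; i < 2000; i++)
            m.put();        // worst case: producer runs, flusher never drains
        System.out.println("synchronous single-entry writes: " + m.syncWrites); // 464
    }
}
```

In this worst case every entry past the 1536th is a synchronous single-entry write, which matches the writeAll()-with-ONE-element pattern reported earlier in the thread.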
>
> On Tue, May 2, 2017 at 2:17 PM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:
> I noticed that behavior when any cache.remove operation is involved. Just putting stuff in the cache seems to work properly.
>
> Do you use remove operation?
>
> On Tue, May 2, 2017 at 9:57 AM, Matt <dromitlabs@gmail.com> wrote:
> I'm stuck with that. No matter what config I use (flush size, write threads, etc.), this is the behavior I always get. It's as if Ignite's internal buffer is full and it's trying to write and get rid of only the oldest (one) element.
>
> Any ideas, people? What is your CacheStore configuration to avoid this?
>
> On Tue, May 2, 2017 at 11:50 AM, Jessie Lin <jessie.jianwei.lin@gmail.com> wrote:
> Hello Matt, thank you for posting. I've noticed similar behavior.
>
> Would be curious to see the response from the engineering team.
>
> Best,
> Jessie
>
> On Tue, May 2, 2017 at 1:03 AM, Matt <dromitlabs@gmail.com> wrote:
> Hi all,
>
> I have two questions for you!
>
> QUESTION 1
>
> I'm following the example in [1] (a mix between "jdbc transactional" and "jdbc bulk operations") and I've enabled write-behind; however, after the first 10k-20k insertions the performance drops *dramatically*.
>
> Based on prints I've added to the CacheStore, I've noticed that what Ignite is doing is this:
>
> - writeAll called with 512 elements (Ignite buffers elements, that's good)
> - openConnection with autocommit=true is called each time inside writeAll (since the session is not stored in atomic mode)
> - writeAll is called with 512 elements a few dozen times, each time opening a new JDBC connection as mentioned above
> - ...
> - writeAll called with ONE element (for some reason Ignite stops buffering elements)
> - writeAll is called with ONE element from here on, each time opening a new JDBC connection as mentioned above
> - ...
>
> Things to note:
>
> - All config values are the default ones, except for write-through and write-behind, which are both enabled.
> - I'm running this as a server node (only one node in the cluster, the application itself).
> - I see the problem even with a big heap (i.e., Ignite is not nearly out of memory).
> - I'm using PostgreSQL for this test (it's fine ingesting around 40k rows per second on this computer, so that shouldn't be a problem).
>
> What is causing Ignite to stop buffering elements after calling writeAll() a few dozen times?
>
> QUESTION 2
>
> I've read in the docs that using ATOMIC mode (the default mode) is better for performance, but I'm not getting why. If I'm not wrong, using TRANSACTIONAL mode would cause the CacheStore to reuse connections (not call openConnection(autocommit=true) on each writeAll()).
>
> Shouldn't it be better to use transactional mode?
>
> Regards,
> Matt
>
> [1] https://apacheignite.readme.io/docs/persistent-store#section-cachestore-example
>