From: Evgeny Pryakhin <breath1988@gmail.com>
To: user@ignite.apache.org
Subject: Re: Slow reads in one cache during batch write on different cache in different region
Date: Tue, 9 Jul 2019 19:16:29 +0300

Thank you for your reply. I will check the thread pools. According to the thread pools description in the Ignite docs, the problem may be in the striped pool.

In my case I have a lot of writes and a small number of reads. If writes and reads are processed through one queue, I will have this problem all the time.

If the problem is in the striped pool, is there any way to split the processing of reads and writes into separate thread pools?

--
Evgeny Pryakhin

> On 9 July 2019, at 18:13, Ilya Kasnacheev <ilya.kasnacheev@gmail.com> wrote:
>
> Hello!
>
> I think you should collect thread dumps from all nodes to locate the bottleneck. Then, maybe you need to adjust thread pool sizes.
>
> My idea here is that some thread pool (striped, probably) gets full with persistent cache write operations, and reads have to wait in the queue.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Tue, 9 Jul 2019 at 18:08, Evgeny Pryakhin <breath1988@gmail.com> wrote:
>> Hello. I need some help or advice on my problem.
>>
>> Preface:
>> - Ignite 2.5 (I have no option to upgrade to a newer version)
>> - cluster of 8 servers, each with 4 CPUs and 64 GB RAM, HDD (not SSD). The operating system was tuned according to the performance guide.
>> - two memory regions configured: one in-memory only (500 MB) and one with persistence enabled (about 40 GB of memory).
>> - one cache in the in-memory region (about 300k records), backups: 3. Write mode: PRIMARY_SYNC.
>> - one cache in the region with persistence (about 50M records), backups: 3. Write mode: PRIMARY_SYNC.
>> - Ignite Thin Client as a driver.
>>
>> Scenario:
>> - I have batch writes to the first (in-memory) cache, about 500/sec, continuously.
>> - I have a lot of reads from the first (in-memory) cache, about 3k/sec, continuously.
>> - I have a lot of batch writes to the second (persistent) cache. Batch size is about 1k records. Continuously.
>>
>> The Problem:
>> - when batch writes to the second (persistent) cache are disabled, reads from the first cache work well with low latency (<1 ms).
>> - when batch writes to the persistent cache are turned on, reads from the first cache become very slow (about 200-300 ms).
>>
>> I have no idea how to even start investigating this problem. Maybe I can check some cluster metrics or system metrics on the hardware servers to find the right way to solve it. Do you have any ideas about this?
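
To make the "reads have to wait in the queue" hypothesis from this thread concrete, here is a minimal toy model in Python. It is only a sketch of head-of-line blocking on a single worker queue, not Ignite's actual striped pool implementation. The function names (`run_stripe`, `max_latency`) and all timing numbers are assumptions chosen for illustration; they are not measurements from the cluster described above.

```python
def run_stripe(tasks):
    """Serve tasks one at a time, in arrival order, on a single worker thread.

    Each task is (arrival_ms, kind, service_ms). Returns [(kind, latency_ms)],
    where latency is completion time minus arrival time.
    """
    clock = 0
    results = []
    # sorted() is stable, so tasks with equal arrival keep their input order.
    for arrival, kind, service in sorted(tasks, key=lambda t: t[0]):
        start = max(clock, arrival)  # wait until the worker is free
        clock = start + service
        results.append((kind, clock - arrival))
    return results


def max_latency(results, kind):
    """Worst-case latency among tasks of the given kind."""
    return max(latency for k, latency in results if k == kind)


# Hypothetical numbers: a burst of five slow persistent writes (8 ms each)
# arrives at t=0; one fast in-memory read (1 ms) arrives at t=1.
writes = [(0, "write", 8)] * 5
reads = [(1, "read", 1)]

shared = run_stripe(writes + reads)  # reads and writes share one queue
separate = run_stripe(reads)         # reads get their own worker

print(max_latency(shared, "read"))    # 40
print(max_latency(separate, "read"))  # 1
```

In this model the read itself is cheap, but behind a burst of slow writes its latency is dominated by queueing time. That is the pattern a thread dump would be expected to confirm or rule out: worker threads busy with persistent-cache writes while read requests sit queued.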