From user-return-17431-archive-asf-public=cust-asf.ponee.io@ignite.apache.org Wed Jan 24 02:41:51 2018
Mailing-List: contact user-help@ignite.apache.org; run by ezmlm
Reply-To: user@ignite.apache.org
From: Raymond Wilson
References: 5490a3e03bbcecad91a434573fc6f3d1@mail.gmail.com
In-Reply-To: 5490a3e03bbcecad91a434573fc6f3d1@mail.gmail.com
MIME-Version: 1.0
Date: Wed, 24 Jan 2018 14:41:35 +1300
Subject: RE: Obtaining metadata about items in the cache
To: user@ignite.apache.org, dev@ignite.apache.org
Content-Type: multipart/alternative; boundary="089e08221c6802428705637bc417"

--089e08221c6802428705637bc417
Content-Type: text/plain; charset="UTF-8"

I’d like to carry this conversation further, cross-posting to the dev list:

I now have possible production use cases for accessing cache key [metadata].

As an example, suppose I want to scan all keys from a cache that may contain large amounts of data and perform some operation on a few of them, based on the value of the key itself.

In this use case the IO bandwidth required for keys & data might be as much as 1000 times the bandwidth required for keys alone, even when considering request parallelization and co-location.

I imagine that Ignite can internally scan cache keys as a part of its internal query operations. Is that correct? If so, would it be difficult to expose this kind of functionality in the Ignite API?

Thanks,
Raymond.

*From:* Raymond Wilson [mailto:raymond_wilson@trimble.com]
*Sent:* Monday, December 4, 2017 11:26 PM
*To:* 'user@ignite.apache.org'
*Subject:* RE: Obtaining metadata about items in the cache

Thanks Alexey.

This would certainly reduce the IO, but it still requires all the data to be read.

My use case is not really a production one: I want to iterate over all items in the cache to determine whether the page size chosen for persistence was suitable. Reading all the data is not too painful, but a metadata scan would be much faster, especially if spread across the cluster as in your example below.

Raymond.

*From:* Alexey Kukushkin [mailto:kukushkinalexey@gmail.com]
*Sent:* Monday, December 4, 2017 11:10 PM
*To:* user@ignite.apache.org
*Subject:* Re: Obtaining metadata about items in the cache

Hi Raymond,

I do not think Ignite supports iterating over metadata, but you could minimise IO by:

- collocated processing (analyse entries locally without sending them over the network)
- working with the binary object representation directly (without serialisation/deserialisation)

You could send your analysis job to each partition and then execute a local scan query that works with binary objects. The code below uses the affinityRun, withKeepBinary and setLocal methods to achieve the above optimizations:

IgniteCompute compute = ignite.compute(ignite.cluster().forServers());

for (int i = 0; i < ignite.affinity("CacheName").partitions(); ++i) {
    final int part = i; // effectively final copy for use inside the lambda

    compute.affinityRun(Collections.singletonList("CacheName"), part, () -> {
        IgniteCache<BinaryObject, BinaryObject> cache = ignite.cache("CacheName").withKeepBinary();

        ScanQuery<BinaryObject, BinaryObject> qry = new ScanQuery<>((k, v) -> { ... });
        qry.setLocal(true);
        qry.setPartition(part); // scan only the partition this job was routed to

        QueryCursor<Cache.Entry<BinaryObject, BinaryObject>> cur = cache.query(qry);
        ...
    });
}

On Mon, Dec 4, 2017 at 1:33 AM, Raymond Wilson wrote:

Hi,

I’d like to be able to scan all the items in a cache where all I am interested in is the cache key and other metadata about the cached item (such as its size).

I can do this now by running a cache query that simply reads out all the cache items, but this is a lot of IO when I don’t care about the content of the items themselves.

Does anyone here do this?

Thanks,
Raymond.

-- 
Best regards,
Alexey

--089e08221c6802428705637bc417
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--089e08221c6802428705637bc417--