Mailing-List: contact dev-help@ignite.apache.org; run by ezmlm
Reply-To: dev@ignite.apache.org
MIME-Version: 1.0
References: <9B2FB1B7-3C8A-47A1-9DD5-B242FFB60A09@apache.org>
From: Sergey Chugunov
Date: Fri, 17 Mar 2017 12:08:53 +0300
Subject: Re:
IGNITE-4536 metrics of new offheap storage
To: dev@ignite.apache.org
Cc: Alexey Goncharuk
Content-Type: multipart/alternative; boundary=001a114afa4e449e22054ae989a2
archived-at: Fri, 17 Mar 2017 09:09:41 -0000

--001a114afa4e449e22054ae989a2
Content-Type: text/plain; charset=UTF-8

Dmitriy,

My main goal was to add a metric to estimate FreeList space fragmentation,
and a histogram ("hist") was the first thing I came up with.

Let's consider one case: we placed into a cache 4 entities, each 60% of the
page size. After that we'll have 4 pages in FreeList, each with a hole of 40%
of its size. Utilization of FreeList will be 60%, but with big fragmentation.

Let's consider another case: we have added and removed a bunch of entries
much smaller than a page. After that we have two pages 90% full, one page 50%
full and one page 10% full. Utilization of FreeList is 60% again (very simple
math: (90 + 90 + 50 + 10) / 4 = 60), but fragmentation is much smaller.

So, when we calculate only a simple average we lose a lot of information, and
this information may be very useful for deciding on the best page size
configuration.

Thanks,
Sergey.

On Thu, Mar 16, 2017 at 10:22 PM, Dmitriy Setrakyan wrote:

> As far as the percentage of the free page space, why do we need to provide
> 3 ranges: 0 -> 16, 16 -> 32, 32 -> 64, etc? Why not just provide average
> free bytes percentage as one value?
>
> Am I misunderstanding something?
>
> On Thu, Mar 16, 2017 at 11:04 AM, Denis Magda wrote:
>
> > Sergey,
> >
> > Considering that the swap tier will no longer be supported in 2.0 all the
> > methods that start with ‘getSwap…’ are no longer relevant and have to be
> > removed from metrics. For instance, the swap functionality has already
> > been wiped out from .NET:
> > https://issues.apache.org/jira/browse/IGNITE-4736
> >
> > Next, I’m also confused with the metrics that include ‘Dht’ in their names.
> > The on-heap tier we have in 1.x will be replaced with on-heap cache:
> > https://issues.apache.org/jira/browse/IGNITE-4535
> > Does it mean that ‘Dht’ methods are still relevant or they need to be
> > replaced with something more meaningful? *Alex G.*, please chime in.
> >
> > Finally, personally I don’t like the API for these 3 methods
> >
> > > public float getPagesPercentage_8_16_freeBytes();
> > > public float getPagesPercentage_16_64_freeBytes();
> > > public float getPagesPercentage_64_256_freeBytes();
> >
> > Wouldn’t it be better to have a single method like this?
> >
> > public float[] getPagesFreeBytesPercentage();
> >
> > where
> >
> > float[0] - 0 to 16 free bytes.
> > float[1] - 16 to 32 free bytes.
> > float[2] - 32 to 64 free bytes.
> > …..
> > float[N] - page_size - 16 to page size free bytes.
> >
> > --
> > Denis
> >
> > > On Mar 16, 2017, at 10:22 AM, Sergey Chugunov <sergey.chugunov@gmail.com>
> > > wrote:
> > >
> > > Denis,
> > >
> > > Here is a version of the CacheMetrics interface with all the changes as
> > > I see them (pretty long list :)).
> > >
> > > public interface CacheMetrics {
> > >     public long getCacheHits();
> > >     public float getCacheHitPercentage();
> > >     public long getCacheMisses();
> > >     public float getCacheMissPercentage();
> > >     public long getCacheGets();
> > >     public long getCachePuts();
> > >     public long getCacheRemovals();
> > >     public long getCacheEvictions();
> > >     public float getAverageGetTime();
> > >     public float getAveragePutTime();
> > >     public float getAverageRemoveTime();
> > >     public float getAverageTxCommitTime();
> > >     public float getAverageTxRollbackTime();
> > >     public long getCacheTxCommits();
> > >     public long getCacheTxRollbacks();
> > >     public String name();
> > >     public long getOverflowSize();
> > >     public long getOffHeapGets();
> > >     public long getOffHeapPuts(); //removing as it duplicates cachePuts
> > >     public long getOffHeapRemovals();
> > >     public long getOffHeapEvictions();
> > >     public long getOffHeapHits();
> > >     public float getOffHeapHitPercentage();
> > >     public long getOffHeapMisses(); //removing as it duplicates cacheMisses
> > >     public float getOffHeapMissPercentage(); //removing as it duplicates cacheMissPercentage
> > >     public long getOffHeapEntriesCount();
> > >     public long getOffHeapPrimaryEntriesCount();
> > >     public long getOffHeapBackupEntriesCount();
> > >     public long getOffHeapAllocatedSize();
> > >     public long getOffHeapMaxSize();
> > >     public long getSwapGets();
> > >     public long getSwapPuts();
> > >     public long getSwapRemovals();
> > >     public long getSwapHits();
> > >     public long getSwapMisses();
> > >     public long getSwapEntriesCount();
> > >     public long getSwapSize();
> > >     public float getSwapHitPercentage();
> > >     public float getSwapMissPercentage();
> > >     public int getSize();
> > >     public int getKeySize();
> > >     public boolean isEmpty();
> > >     public int getDhtEvictQueueCurrentSize();
> > >     public int getTxThreadMapSize();
> > >     public int getTxXidMapSize();
> > >     public int getTxCommitQueueSize();
> > >     public int getTxPrepareQueueSize();
> > >     public int getTxStartVersionCountsSize();
> > >     public int getTxCommittedVersionsSize();
> > >     public int getTxRolledbackVersionsSize();
> > >     public int getTxDhtThreadMapSize();
> > >     public int getTxDhtXidMapSize();
> > >     public int getTxDhtCommitQueueSize();
> > >     public int getTxDhtPrepareQueueSize();
> > >     public int getTxDhtStartVersionCountsSize();
> > >     public int getTxDhtCommittedVersionsSize();
> > >     public int getTxDhtRolledbackVersionsSize();
> > >     public boolean isWriteBehindEnabled();
> > >     public int getWriteBehindFlushSize();
> > >     public int getWriteBehindFlushThreadCount();
> > >     public long getWriteBehindFlushFrequency();
> > >     public int getWriteBehindStoreBatchSize();
> > >     public int getWriteBehindTotalCriticalOverflowCount();
> > >     public int getWriteBehindCriticalOverflowCount();
> > >     public int getWriteBehindErrorRetryCount();
> > >     public int getWriteBehindBufferSize();
> > >     public String getKeyType();
> > >     public String getValueType();
> > >     public boolean isStoreByValue();
> > >     public boolean isStatisticsEnabled();
> > >     public boolean isManagementEnabled();
> > >     public boolean isReadThrough();
> > >     public boolean isWriteThrough();
> > >     public long getTotalAllocatedPages();
> > >     public long getTotalEvictedPages();
> > > }
> > >
> > > Also I suggest to introduce new interface for MemoryPolicy metrics and
> > > make it available through *IgniteCacheDatabaseSharedManager*:
> > >
> > > public interface IgniteMemoryPolicyMetrics {
> > >     /**
> > >      * @return Memory policy name.
> > >      */
> > >     public String getName();
> > >
> > >     /**
> > >      * @return Total number of allocated pages.
> > >      */
> > >     public long getTotalAllocatedPages();
> > >
> > >     /**
> > >      * @return Amount (in bytes) of not yet allocated space in PageMemory.
> > >      */
> > >     public long getAvailableSpace();
> > >
> > >     /**
> > >      * @return Number of allocated pages per second within PageMemory.
> > >      */
> > >     public float getAllocationRate();
> > >
> > >     /**
> > >      * @return Number of evicted pages per second within PageMemory.
> > >      */
> > >     public float getEvictionRate();
> > >
> > >     /**
> > >      * Large entities bigger than page are split into fragments so each
> > >      * fragment can fit into a page.
> > >      *
> > >      * @return Percentage of pages fully occupied by large entities.
> > >      */
> > >     public long getLargeEntriesPagesPercentage();
> > >
> > >     //---FreeList-related metrics
> > >
> > >     /**
> > >      * @return Free space to overall size ratio across all pages in FreeList.
> > >      */
> > >     public float getPagesFillFactor();
> > >
> > >     /**
> > >      * @return Percentage of pages in FreeList with free space >= 8 and < 16 bytes
> > >      */
> > >     public float getPagesPercentage_8_16_freeBytes();
> > >
> > >     /**
> > >      * @return Percentage of pages in FreeList with free space >= 16 and < 64 bytes
> > >      */
> > >     public float getPagesPercentage_16_64_freeBytes();
> > >
> > >     /**
> > >      * @return Percentage of pages in FreeList with free space >= 64 and < 256 bytes
> > >      */
> > >     public float getPagesPercentage_64_256_freeBytes();
> > > }
> > >
> > > In my mind last three methods provide some kind of hist to give an
> > > insight about memory fragmentation.
> > > If there are a lot of pages with relatively big free chunks and fewer
> > > with smaller chunks, it may indicate that memory is fragmented and it
> > > may be reasonable to adjust page sizes.
> > >
> > > Thanks,
> > > Sergey.
> > >
> > > On Thu, Mar 16, 2017 at 1:29 AM, Denis Magda wrote:
> > >
> > >> Hi Sergey,
> > >>
> > >>>> In memory management scheme based on MemoryPolicies it may be useful
> > >>>> (and easier) to collect some metrics not for individual caches but for
> > >>>> whole MemoryPolicies where several caches may reside.
> > >>
> > >> I would collect the metrics for every single MemoryPolicy as well as for
> > >> individual caches. It makes sense to expose which cache contributes more
> > >> to memory utilization.
> > >>
> > >>>> - free space / used space tracking;
> > >>>> - allocation / eviction rate;
> > >>
> > >> Please consider this as well:
> > >> - total number of pages;
> > >> - total number of entries (how hard to support?).
> > >>
> > >>>> - metrics to track memory fragmentation: e.g. % of pages with only 8
> > >>>> bytes free, 16 bytes free and so on;
> > >>>> - % of big fragmented entries in cache: may be useful to adjust page
> > >>>> size.
> > >>
> > >> How do you see this in the metrics interface?
> > >>
> > >>> 3. Useful, not going to remove:
> > >>> getOffHeapGets //useful as there still may be deserialized entries
> > >>> residing on-heap
> > >>> getOffHeapHitPercentage
> > >>> getOffHeapHits //overall hits include offheap and onheap
> > >>> getOffHeapMisses //I think in new model is the same as getCacheMisses
> > >>> getOffHeapMissPercentage //same as above
> > >>> getOffHeapPuts //same as above
> > >>> getOffHeapRemovals //same as above
> > >>
> > >> Could you please prepare an updated version of the cache metrics adding
> > >> new methods and renaming existing ones (only if necessary)?
> > >> It will be simpler to keep up the discussion relying on this updated
> > >> interface.
> > >>
> > >> --
> > >> Denis
> > >>
> > >>> On Mar 15, 2017, at 8:32 AM, Sergey Chugunov <sergey.chugunov@gmail.com>
> > >>> wrote:
> > >>>
> > >>> Also I looked through the current set of metrics available on the
> > >>> *CacheMetrics* interface and suggest the following changes:
> > >>>
> > >>> 1. All methods related to tracking swap space (including
> > >>> *getOverflowSize*) to be removed.
> > >>>
> > >>> 2. Useless/hard to calculate in new memory management approach:
> > >>> getOffHeapAllocatedSize //max size is constrained by MemoryPolicy config
> > >>> getOffHeapEntriesCount //all cache entries live offheap
> > >>> getOffHeapEvictions //will be captured on MemoryPolicyMetrics level;
> > >>> getOffHeapMaxSize //same as the first one
> > >>>
> > >>> 3. Useful, not going to remove:
> > >>> getOffHeapGets //useful as there still may be deserialized entries
> > >>> residing on-heap
> > >>> getOffHeapHitPercentage
> > >>> getOffHeapHits //overall hits include offheap and onheap
> > >>> getOffHeapMisses //I think in new model is the same as getCacheMisses
> > >>> getOffHeapMissPercentage //same as above
> > >>> getOffHeapPuts //same as above
> > >>> getOffHeapRemovals //same as above
> > >>>
> > >>> Please share your thoughts if I missed something here.
> > >>>
> > >>> Thanks,
> > >>> Sergey Chugunov.
> > >>>
> > >>> On Wed, Mar 15, 2017 at 4:51 PM, Sergey Chugunov
> > >>> <sergey.chugunov@gmail.com> wrote:
> > >>>
> > >>>> Hello Igniters,
> > >>>>
> > >>>> As part of [1] cache metrics need to be updated as some of them like
> > >>>> swap hits are not applicable anymore.
> > >>>>
> > >>>> In memory management scheme based on MemoryPolicies it may be useful
> > >>>> (and easier) to collect some metrics not for individual caches but for
> > >>>> whole MemoryPolicies where several caches may reside.
> > >>>>
> > >>>> I suggest the following list of new metrics to collect for each
> > >>>> MemoryPolicy:
> > >>>>
> > >>>> - free space / used space tracking;
> > >>>> - allocation / eviction rate;
> > >>>> - metrics to track memory fragmentation: e.g. % of pages with only 8
> > >>>> bytes free, 16 bytes free and so on;
> > >>>> - % of big fragmented entries in cache: may be useful to adjust page
> > >>>> size.
> > >>>>
> > >>>> Please suggest any other metrics that may be worth tracking.
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/IGNITE-3477
> > >>>>
> > >>>> Thanks,
> > >>>> Sergey Chugunov.
> > >>>>

--001a114afa4e449e22054ae989a2--
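
As a rough illustration of the two FreeList cases described at the top of
this thread, here is a minimal Java sketch. It assumes four pages whose fill
levels are given as whole percentages and splits free space into four equal
buckets. The class and method names (FreeListHistogramSketch,
averageFillPercent, freeSpaceHistogram) are hypothetical and are not part of
the Ignite API or of the interfaces proposed in the thread.

```java
import java.util.Arrays;

/** Hypothetical sketch only; not part of the Ignite API. */
final class FreeListHistogramSketch {
    /** Average page utilization, in percent, over all pages. */
    static double averageFillPercent(int[] fillPercents) {
        long sum = 0;
        for (int p : fillPercents)
            sum += p;
        return (double) sum / fillPercents.length;
    }

    /**
     * Share of pages per free-space bucket. The 0..100% free-space range is
     * split into {@code buckets} equal parts (an assumption of this sketch;
     * the thread discusses byte ranges instead).
     */
    static float[] freeSpaceHistogram(int[] fillPercents, int buckets) {
        int[] counts = new int[buckets];
        for (int p : fillPercents) {
            int freePercent = 100 - p;
            // A completely free page (100%) is clamped into the last bucket.
            int idx = Math.min(freePercent * buckets / 100, buckets - 1);
            counts[idx]++;
        }
        float[] shares = new float[buckets];
        for (int i = 0; i < buckets; i++)
            shares[i] = (float) counts[i] / fillPercents.length;
        return shares;
    }

    public static void main(String[] args) {
        int[] caseOne = {60, 60, 60, 60}; // four pages, each 60% full
        int[] caseTwo = {90, 90, 50, 10}; // mixed fill levels

        // prints "60.0 [0.0, 1.0, 0.0, 0.0]"
        System.out.println(averageFillPercent(caseOne) + " "
            + Arrays.toString(freeSpaceHistogram(caseOne, 4)));
        // prints "60.0 [0.5, 0.0, 0.25, 0.25]"
        System.out.println(averageFillPercent(caseTwo) + " "
            + Arrays.toString(freeSpaceHistogram(caseTwo, 4)));
    }
}
```

Both cases report the same 60% average, while the bucket shares differ, which
is the information a single averaged value would hide.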