ignite-dev mailing list archives

From Sergey Chugunov <sergey.chugu...@gmail.com>
Subject Re: IGNITE-4536 metrics of new offheap storage
Date Fri, 17 Mar 2017 09:08:53 GMT
Dmitriy,

My main goal was to add a metric that estimates FreeList space fragmentation,
and a histogram ("hist") was the first thing I came up with.

Let's consider one case: we put 4 entries into a cache, each 60% of the page
size. After that we'll have 4 pages in the FreeList, each with a hole of 40%
of its size.
Utilization of the FreeList will be 60%, but with big fragmentation.

Let's consider another case: we have added and removed a bunch of entries
much smaller than a page. After that we have two pages 90% full, one page
50% full and one page 10% full.
Utilization of the FreeList is 60% again, (90 + 90 + 50 + 10) / 4, but
fragmentation is much smaller.

So when we calculate only a simple average we lose a lot of information,
and this information may be very useful when deciding on the best page
size configuration.
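
To make the two cases concrete, here is a minimal sketch of the arithmetic. It is only an illustration, not Ignite code: the class name, the method names and the choice of four quarter-page buckets are all assumptions made for this example, and pages are described simply by how full they are in percent.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Ignite: compares a plain average
// with a histogram of free space for the two cases described above.
final class FreeListFragmentation {

    /** Average fill level across pages, in percent. */
    static double utilizationPercent(int[] usedPercentPerPage) {
        return Arrays.stream(usedPercentPerPage).average().orElse(0.0);
    }

    /**
     * Share of pages per free-space bucket. With 4 buckets the ranges are
     * 0-25%, 25-50%, 50-75% and 75-100% of the page being free
     * (a fully free page is counted in the last bucket).
     */
    static float[] freeSpaceHistogram(int[] usedPercentPerPage, int buckets) {
        int[] counts = new int[buckets];
        for (int used : usedPercentPerPage) {
            int free = 100 - used;
            int idx = Math.min(free * buckets / 100, buckets - 1);
            counts[idx]++;
        }
        float[] hist = new float[buckets];
        int pages = usedPercentPerPage.length;
        for (int i = 0; i < buckets; i++)
            hist[i] = pages == 0 ? 0f : counts[i] / (float) pages;
        return hist;
    }

    public static void main(String[] args) {
        int[] caseOne = {60, 60, 60, 60}; // four pages, each 60% full
        int[] caseTwo = {90, 90, 50, 10}; // mixed fill levels

        // Both cases report the same average: 60.0
        System.out.println(utilizationPercent(caseOne));
        System.out.println(utilizationPercent(caseTwo));

        // The histograms differ:
        // [0.0, 1.0, 0.0, 0.0]
        System.out.println(Arrays.toString(freeSpaceHistogram(caseOne, 4)));
        // [0.5, 0.0, 0.25, 0.25]
        System.out.println(Arrays.toString(freeSpaceHistogram(caseTwo, 4)));
    }
}
```

The single average cannot tell the two cases apart, while the per-bucket shares can. The bucket boundaries here are relative to the page size only to keep the example short; the ranges in the proposed interface are in bytes.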

Thanks,
Sergey.


On Thu, Mar 16, 2017 at 10:22 PM, Dmitriy Setrakyan <dsetrakyan@apache.org>
wrote:

> As far as the percentage of the free page space, why do we need to provide
> 3 ranges: 0 -> 16, 16 -> 32, 32 -> 64, etc? Why not just provide average
> free bytes percentage as one value?
>
> Am I misunderstanding something?
>
> On Thu, Mar 16, 2017 at 11:04 AM, Denis Magda <dmagda@apache.org> wrote:
>
> > Sergey,
> >
> > Considering that the swap tier will no longer be supported in 2.0 all the
> > methods that start with ‘getSwap…’ are no longer relevant and have to be
> > removed from metrics. For instance, the swap functionality has already
> > been wiped out from .NET:
> > https://issues.apache.org/jira/browse/IGNITE-4736
> >
> > Next, I’m also confused by the metrics that include ‘Dht’ in their names.
> > The on-heap tier we have in 1.x will be replaced with on-heap cache:
> > https://issues.apache.org/jira/browse/IGNITE-4535
> > Does it mean that the ‘Dht’ methods are still relevant, or do they need to be
> > replaced with something more meaningful? *Alex G.*, please chime in.
> >
> > Finally, personally I don’t like the API for these 3 methods
> >
> > >
> > >    public float getPagesPercentage_8_16_freeBytes();
> > >    public float getPagesPercentage_16_64_freeBytes();
> > >    public float getPagesPercentage_64_256_freeBytes();
> >
> > Wouldn’t it be better to have a single method like this?
> >
> > public float[] getPagesFreeBytesPercentage();
> >
> > where
> >
> > float[0] - 0 to 16 free bytes.
> > float[1] - 16 to 32 free bytes.
> > float[2] - 32 to 64 free bytes.
> > …..
> > float[N] - page_size - 16 to page size free bytes.
> >
> > —
> > Denis
> >
> > > On Mar 16, 2017, at 10:22 AM, Sergey Chugunov <sergey.chugunov@gmail.com> wrote:
> > >
> > > Denis,
> > >
> > > Here is a version of the CacheMetrics interface with all the changes as I
> > > see them (pretty long list :)).
> > >
> > > public interface CacheMetrics {
> > >
> > >   public long getCacheHits();
> > >
> > >   public float getCacheHitPercentage();
> > >
> > >   public long getCacheMisses();
> > >
> > >   public float getCacheMissPercentage();
> > >
> > >   public long getCacheGets();
> > >
> > >   public long getCachePuts();
> > >
> > >   public long getCacheRemovals();
> > >
> > >   public long getCacheEvictions();
> > >
> > >   public float getAverageGetTime();
> > >
> > >   public float getAveragePutTime();
> > >
> > >   public float getAverageRemoveTime();
> > >
> > >   public float getAverageTxCommitTime();
> > >
> > >   public float getAverageTxRollbackTime();
> > >
> > >   public long getCacheTxCommits();
> > >
> > >   public long getCacheTxRollbacks();
> > >
> > >   public String name();
> > >
> > >   public long getOverflowSize();
> > >
> > >   public long getOffHeapGets();
> > >
> > >   public long getOffHeapPuts();//removing as it duplicates cachePuts
> > >
> > >   public long getOffHeapRemovals();
> > >
> > >   public long getOffHeapEvictions();
> > >
> > >   public long getOffHeapHits();
> > >
> > >   public float getOffHeapHitPercentage();
> > >
> > >   public long getOffHeapMisses();//removing as it duplicates cacheMisses
> > >
> > >   public float getOffHeapMissPercentage();//removing as it duplicates cacheMissPercentage
> > >
> > >   public long getOffHeapEntriesCount();
> > >
> > >   public long getOffHeapPrimaryEntriesCount();
> > >
> > >   public long getOffHeapBackupEntriesCount();
> > >
> > >   public long getOffHeapAllocatedSize();
> > >
> > >   public long getOffHeapMaxSize();
> > >
> > >   public long getSwapGets();
> > >
> > >   public long getSwapPuts();
> > >
> > >   public long getSwapRemovals();
> > >
> > >   public long getSwapHits();
> > >
> > >   public long getSwapMisses();
> > >
> > >   public long getSwapEntriesCount();
> > >
> > >   public long getSwapSize();
> > >
> > >   public float getSwapHitPercentage();
> > >
> > >   public float getSwapMissPercentage();
> > >
> > >   public int getSize();
> > >
> > >   public int getKeySize();
> > >
> > >   public boolean isEmpty();
> > >
> > >   public int getDhtEvictQueueCurrentSize();
> > >
> > >   public int getTxThreadMapSize();
> > >
> > >   public int getTxXidMapSize();
> > >
> > >   public int getTxCommitQueueSize();
> > >
> > >   public int getTxPrepareQueueSize();
> > >
> > >   public int getTxStartVersionCountsSize();
> > >
> > >   public int getTxCommittedVersionsSize();
> > >
> > >   public int getTxRolledbackVersionsSize();
> > >
> > >   public int getTxDhtThreadMapSize();
> > >
> > >   public int getTxDhtXidMapSize();
> > >
> > >   public int getTxDhtCommitQueueSize();
> > >
> > >   public int getTxDhtPrepareQueueSize();
> > >
> > >   public int getTxDhtStartVersionCountsSize();
> > >
> > >   public int getTxDhtCommittedVersionsSize();
> > >
> > >   public int getTxDhtRolledbackVersionsSize();
> > >
> > >   public boolean isWriteBehindEnabled();
> > >
> > >   public int getWriteBehindFlushSize();
> > >
> > >   public int getWriteBehindFlushThreadCount();
> > >
> > >   public long getWriteBehindFlushFrequency();
> > >
> > >   public int getWriteBehindStoreBatchSize();
> > >
> > >   public int getWriteBehindTotalCriticalOverflowCount();
> > >
> > >   public int getWriteBehindCriticalOverflowCount();
> > >
> > >   public int getWriteBehindErrorRetryCount();
> > >
> > >   public int getWriteBehindBufferSize();
> > >
> > >   public String getKeyType();
> > >
> > >   public String getValueType();
> > >
> > >   public boolean isStoreByValue();
> > >
> > >   public boolean isStatisticsEnabled();
> > >
> > >   public boolean isManagementEnabled();
> > >
> > >   public boolean isReadThrough();
> > >
> > >   public boolean isWriteThrough();
> > >
> > >   public long getTotalAllocatedPages();
> > >
> > >   public long getTotalEvictedPages();
> > >
> > > }
> > >
> > >
> > > Also I suggest introducing a new interface for MemoryPolicy metrics and
> > > making it available through *IgniteCacheDatabaseSharedManager*:
> > >
> > >
> > > public interface IgniteMemoryPolicyMetrics {
> > >
> > >    /**
> > >
> > >     * @return Memory policy name.
> > >
> > >     */
> > >
> > >    public String getName();
> > >
> > >
> > >    /**
> > >
> > >     * @return Total number of allocated pages.
> > >
> > >     */
> > >
> > >    public long getTotalAllocatedPages();
> > >
> > >
> > >    /**
> > >
> > >     * @return Amount (in bytes) of not yet allocated space in PageMemory.
> > >
> > >     */
> > >
> > >    public long getAvailableSpace();
> > >
> > >
> > >    /**
> > >
> > >     * @return Number of allocated pages per second within PageMemory.
> > >
> > >     */
> > >
> > >    public float getAllocationRate();
> > >
> > >
> > >    /**
> > >
> > >     * @return Number of evicted pages per second within PageMemory.
> > >
> > >     */
> > >
> > >    public float getEvictionRate();
> > >
> > >
> > >    /**
> > >
> > >     * Large entities bigger than page are split into fragments so each fragment can fit into a page.
> > >
> > >     *
> > >
> > >     * @return Percentage of pages fully occupied by large entities.
> > >
> > >     */
> > >
> > >    public long getLargeEntriesPagesPercentage();
> > >
> > >
> > >    //---FreeList-related metrics
> > >
> > >
> > >    /**
> > >
> > >     * @return Free space to overall size ratio across all pages in FreeList.
> > >
> > >     */
> > >
> > >    public float getPagesFillFactor();
> > >
> > >
> > >    /**
> > >
> > >     * @return Percentage of pages in FreeList with free space >= 8 and < 16 bytes
> > >
> > >     */
> > >
> > >    public float getPagesPercentage_8_16_freeBytes();
> > >
> > >
> > >    /**
> > >
> > >     * @return Percentage of pages in FreeList with free space >= 16 and < 64 bytes
> > >
> > >     */
> > >
> > >    public float getPagesPercentage_16_64_freeBytes();
> > >
> > >
> > >    /**
> > >
> > >     * @return Percentage of pages in FreeList with free space >= 64 and < 256 bytes
> > >
> > >     */
> > >
> > >    public float getPagesPercentage_64_256_freeBytes();
> > >
> > > }
> > >
> > > In my mind the last three methods provide a kind of histogram that gives
> > > insight into memory fragmentation.
> > > If there are a lot of pages with relatively big free chunks and fewer with
> > > smaller chunks, it may indicate that memory is fragmented and it may be
> > > reasonable to adjust the page size.
> > >
> > > Thanks,
> > > Sergey.
> > >
> > >
> > >
> > > On Thu, Mar 16, 2017 at 1:29 AM, Denis Magda <dmagda@apache.org> wrote:
> > >
> > >> Hi Sergey,
> > >>
> > >>>> In memory management scheme based on MemoryPolicies it may be useful
> > >>>> (and easier) to collect some metrics not for individual caches but for
> > >>>> whole MemoryPolicies where several caches may reside.
> > >>>>
> > >>
> > >> I would collect the metrics for every single MemoryPolicy as well as for
> > >> individual caches. It makes sense to expose which cache contributes
> > >> more to memory utilization.
> > >>
> > >>>>  - free space / used space tracking;
> > >>>>  - allocation / eviction rate;
> > >>
> > >> Please consider this as well:
> > >> - total number of pages;
> > >> - total number of entries (how hard to support?).
> > >>
> > >>>>  - metrics to track memory fragmentation: e.g. % of pages with only 8
> > >>>>  bytes free, 16 bytes free and so on;
> > >>>>  - % of big fragmented entries in cache: may be useful to adjust page
> > >>>>  size.
> > >>
> > >>>
> > >> How do you see this in the metrics interface?
> > >>
> > >>
> > >>>  3. Useful, not going to remove:
> > >>>  getOffHeapGets //useful as there still may be deserialized entries
> > >>>  residing on-heap
> > >>>  getOffHeapHitPercentage
> > >>>  getOffHeapHits //overall hits include offheap and onheap
> > >>>  getOffHeapMisses //I think in new model is the same as getCacheMisses
> > >>>  getOffHeapMissPercentage //same as above
> > >>>  getOffHeapPuts //same as above
> > >>>  getOffHeapRemovals //same as above
> > >>
> > >> Could you please prepare an updated version of the cache metrics adding
> > >> new methods and renaming existing ones (only if necessary)? It will be
> > >> simpler to keep up the discussion relying on this updated interface.
> > >>
> > >> —
> > >> Denis
> > >>
> > >>> On Mar 15, 2017, at 8:32 AM, Sergey Chugunov <sergey.chugunov@gmail.com> wrote:
> > >>>
> > >>> Also I looked through the current set of metrics available on the
> > >>> *CacheMetrics* interface and suggest the following changes:
> > >>>
> > >>>
> > >>>  1. All methods related to tracking swap space (including
> > >>>  *getOverflowSize*) to be removed.
> > >>>
> > >>>  2. Useless/hard to calculate in new memory management approach:
> > >>>  getOffHeapAllocatedSize //max size is constrained by MemoryPolicy config
> > >>>  getOffHeapEntriesCount //all cache entries live offheap
> > >>>  getOffHeapEvictions //will be captured on MemoryPolicyMetrics level;
> > >>>  getOffHeapMaxSize //same as the first one
> > >>>
> > >>>  3. Useful, not going to remove:
> > >>>  getOffHeapGets //useful as there still may be deserialized entries
> > >>>  residing on-heap
> > >>>  getOffHeapHitPercentage
> > >>>  getOffHeapHits //overall hits include offheap and onheap
> > >>>  getOffHeapMisses //I think in new model is the same as getCacheMisses
> > >>>  getOffHeapMissPercentage //same as above
> > >>>  getOffHeapPuts //same as above
> > >>>  getOffHeapRemovals //same as above
> > >>>
> > >>> Please share your thoughts if I missed something here.
> > >>>
> > >>> Thanks,
> > >>> Sergey Chugunov.
> > >>>
> > >>> On Wed, Mar 15, 2017 at 4:51 PM, Sergey Chugunov <sergey.chugunov@gmail.com> wrote:
> > >>>
> > >>>> Hello Igniters,
> > >>>>
> > >>>> As part of [1] cache metrics need to be updated as some of them like
> > >>>> swap hits are not applicable anymore.
> > >>>>
> > >>>> In memory management scheme based on MemoryPolicies it may be useful
> > >>>> (and easier) to collect some metrics not for individual caches but for
> > >>>> whole MemoryPolicies where several caches may reside.
> > >>>>
> > >>>> I suggest the following list of new metrics to collect for each
> > >>>> MemoryPolicy:
> > >>>>
> > >>>>  - free space / used space tracking;
> > >>>>  - allocation / eviction rate;
> > >>>>  - metrics to track memory fragmentation: e.g. % of pages with only 8
> > >>>>  bytes free, 16 bytes free and so on;
> > >>>>  - % of big fragmented entries in cache: may be useful to adjust page
> > >>>>  size.
> > >>>>
> > >>>>
> > >>>> Please suggest any other metrics that may be worth tracking.
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/IGNITE-3477
> > >>>>
> > >>>> Thanks,
> > >>>> Sergey Chugunov.
> > >>>>
> > >>
> > >>
> >
> >
>
