From "Ivan Rakov (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-6930) Optionally to do not write free list updates to WAL
Date Thu, 03 Oct 2019 18:18:02 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943806#comment-16943806 ]

Ivan Rakov edited comment on IGNITE-6930 at 10/3/19 6:17 PM:
-------------------------------------------------------------

[~alex_pl], I've taken a look. Some comments:
1) testRestoreFreeListCorrectlyAfterRandomStop - why do we need to disable caching here?
2) testFreeListUnderLoadMultipleCheckpoints - what is being tested? I think we need to add
a comment that the test is intended to cover the weakened pageId != 0 assertion.
3) MAX_SIZE, STRIPES_COUNT - don't you think that we should make these options configurable?
4) How did you choose 64 and 4 as defaults? Can you share some benchmarks? I think that 64
might be overkill: in a data load scenario, data pages move from the biggest to the lowest buckets
in turn. I don't think that pages are likely to accumulate heavily in any one bucket; maybe
8 as MAX_SIZE would show the same performance boost.
5) PagesList.PagesCache#flush: do we need to garbage-collect all allocated long lists when
we flush the page cache? We can just clear() them and reuse them after the checkpoint. That should
reduce GC pressure.
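
To illustrate point 5, here is a minimal sketch of a primitive long list whose clear() keeps the backing array, so the same instance can be refilled after a checkpoint without new allocations. ReusableLongList and all of its methods are hypothetical names for illustration only; this is not the PagesList.PagesCache code from the patch.

```java
import java.util.Arrays;

/** Hypothetical growable list of page IDs (illustration only, not an Ignite class). */
final class ReusableLongList {
    /** Backing array; it grows on demand and is never released by clear(). */
    private long[] items = new long[8];

    /** Number of valid elements in the backing array. */
    private int size;

    /** Appends a page ID, doubling the backing array when it is full. */
    void add(long pageId) {
        if (size == items.length)
            items = Arrays.copyOf(items, size * 2);

        items[size++] = pageId;
    }

    /** Returns the element at the given index. */
    long get(int idx) {
        if (idx < 0 || idx >= size)
            throw new IndexOutOfBoundsException("idx=" + idx + ", size=" + size);

        return items[idx];
    }

    int size() {
        return size;
    }

    int capacity() {
        return items.length;
    }

    /** Resets the logical size only; the array stays allocated for reuse. */
    void clear() {
        size = 0;
    }
}
```

With this shape, a flush would read the elements, call clear(), and keep the list instance for the next checkpoint interval instead of dropping it for the garbage collector.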



> Optionally to do not write free list updates to WAL
> ---------------------------------------------------
>
>                 Key: IGNITE-6930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6930
>             Project: Ignite
>          Issue Type: Task
>          Components: cache
>            Reporter: Vladimir Ozerov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: IEP-8, performance
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a cache entry is created, we need to update the free list. When an entry is updated,
> we need to update the free list(s) several times. Currently the free list is a persistent structure,
> so every update to it must be logged to make recovery after a crash possible. This may incur significant
> overhead, especially for small entries.
> E.g. this is what the WAL for a single update looks like. "D" - updates with real data, "F"
> - free-list management:
> {code}
>  1. [D] DataRecord [writeEntries=[UnwrapDataEntry[k = key, v = [ BinaryObject [idHash=2053299190,
hash=1986931360, typeId=-1580729813]], super = [DataEntry [cacheId=94416770, op=UPDATE, writeVer=GridCacheVersion
[topVer=122147562, order=1510667560607, nodeOrder=1], partId=0, partCnt=4]]]], super=WALRecord
[size=0, chainSize=0, pos=null, type=DATA_RECORD]]
>  2. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000006,
grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000006, super=WALRecord
[size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
>  3. [D] DataPageInsertRecord [super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005,
super=WALRecord [size=129, chainSize=0, pos=null, type=DATA_PAGE_INSERT_RECORD]]]
>  4. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770,
pageId=0001000000000008, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
>  5. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710664, super=PageDeltaRecord
[grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null,
type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
>  6. [D] ReplaceRecord [io=DataLeafIO[ver=1], idx=0, super=PageDeltaRecord [grpId=94416770,
pageId=0001000000000004, super=WALRecord [size=47, chainSize=0, pos=null, type=BTREE_PAGE_REPLACE]]]
>  7. [F] DataPageRemoveRecord [itemId=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005,
super=WALRecord [size=30, chainSize=0, pos=null, type=DATA_PAGE_REMOVE_RECORD]]]
>  8. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000008,
grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000008, super=WALRecord
[size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
>  9. [F] DataPageSetFreeListPageRecord [freeListPage=0, super=PageDeltaRecord [grpId=94416770,
pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> 10. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770,
pageId=0001000000000006, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
> 11. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710662, super=PageDeltaRecord
[grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null,
type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> {code}
> If you sum all the space required for the operation (the size in p.3 is shown incorrectly here),
> you will see that the data update required ~300 bytes, and so did the free list updates!
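
As a rough check of the figure above, the sizes printed in the listing can be added up per category. This is only a sketch: WalSizeTally is a hypothetical helper, and the sizes are copied from the listing exactly as printed. Since the description says at least one printed size is incorrect, the data total is indicative only.

```java
import java.util.stream.IntStream;

/** Hypothetical helper that sums the record sizes printed in the WAL listing. */
final class WalSizeTally {
    /** Sizes as printed for the [D] records (records 1, 3 and 6). */
    static final int[] DATA_SIZES = {0, 129, 47};

    /** Sizes as printed for the [F] records (records 2, 4, 5, 7, 8, 9, 10 and 11). */
    static final int[] FREE_LIST_SIZES = {37, 37, 37, 30, 37, 37, 37, 37};

    /** Sums the given record sizes in bytes. */
    static int total(int[] sizes) {
        return IntStream.of(sizes).sum();
    }

    public static void main(String[] args) {
        System.out.println("data bytes (as printed): " + total(DATA_SIZES));
        System.out.println("free-list bytes (as printed): " + total(FREE_LIST_SIZES));
    }
}
```

The eight free-list records alone add up to 289 bytes as printed, which matches the "~300 bytes" estimate for free list management.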
> *Proposed solution*
> 1) Optionally do not write free list updates to WAL.
> 2) On node restart, start with empty free lists, so data inserts will have
> to allocate new pages.
> 3) When an old data page is read, add it to the free list.
> 4) Start a background thread that iterates over all old data pages and re-creates
> the free list, so that eventually all data pages are tracked.
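
The four proposed steps could be sketched roughly as follows. This is a simplified in-memory model under stated assumptions, not Ignite code: VolatileFreeList, its methods, and the hasFreeSpace predicate are all hypothetical, pages are plain long IDs, and a single "has free space" flag stands in for the real bucketed free lists.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongPredicate;

/** Hypothetical free list that is not logged to WAL and is rebuilt after restart. */
final class VolatileFreeList {
    /** Pages known to have free space; empty right after a restart (step 2). */
    private final Deque<Long> freePages = new ArrayDeque<>();

    /** Guards against adding the same page twice. */
    private final Set<Long> tracked = new HashSet<>();

    /** Next page ID to hand out when nothing is tracked yet. */
    private long nextPageId;

    VolatileFreeList(long firstUnallocatedPageId) {
        nextPageId = firstUnallocatedPageId;
    }

    /** Step 2: with an empty list, an insert falls back to allocating a new page. */
    synchronized long takePageForInsert() {
        Long pageId = freePages.poll();

        if (pageId != null) {
            tracked.remove(pageId);
            return pageId;
        }

        return nextPageId++;
    }

    /** Step 3: an old data page that was just read is added if it has free space. */
    synchronized void onPageRead(long pageId, boolean hasFreeSpace) {
        if (hasFreeSpace && tracked.add(pageId))
            freePages.add(pageId);
    }

    /** Step 4 (synchronous part): visit all old pages so they end up tracked. */
    void rebuild(long[] oldPageIds, LongPredicate hasFreeSpace) {
        for (long pageId : oldPageIds)
            onPageRead(pageId, hasFreeSpace.test(pageId));
    }

    /** Step 4: run the same pass on a background daemon thread. */
    Thread startBackgroundRebuild(long[] oldPageIds, LongPredicate hasFreeSpace) {
        Thread t = new Thread(() -> rebuild(oldPageIds, hasFreeSpace), "free-list-rebuild");
        t.setDaemon(true);
        t.start();
        return t;
    }

    synchronized int size() {
        return freePages.size();
    }
}
```

Because nothing here is persisted, step 1 is implicit: no update to this structure produces a WAL record. The trade-off is that some space stays unused after a restart until the background pass has visited the old pages.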



--
This message was sent by Atlassian Jira
(v8.3.4#803005)