ignite-dev mailing list archives

From: Anton Vinogradov <...@apache.org>
Subject: Re: How to free up space on disc after removing entries from IgniteCache with enabled PDS?
Date: Mon, 30 Sep 2019 06:28:54 GMT
Alexei,
>> stopping fragmented node and removing partition data, then starting it
again

That's exactly what we're doing now to solve the fragmentation issue.
The problem is that we have to perform N/B restart-rebalance operations
(N is the cluster size, B is the backup count), which takes a lot of time
and carries a risk of losing data.
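For illustration, on a hypothetical 30-node cluster with 2 backups that would
be 30 / 2 = 15 sequential restart-rebalance rounds, each of which re-transfers
the restarted nodes' data over the network.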

On Fri, Sep 27, 2019 at 5:49 PM Alexei Scherbakov <
alexey.scherbakoff@gmail.com> wrote:

> Probably this should be exposed through a public API; it is effectively the
> same as manual rebalancing.
>
> On Fri, Sep 27, 2019 at 17:40, Alexei Scherbakov <
> alexey.scherbakoff@gmail.com> wrote:
>
> > The poor man's solution to the problem would be to stop the fragmented
> > node, remove its partition data, and start it again, allowing a full state
> > transfer that no longer contains the deleted entries.
> > Rinse and repeat for all owners.
> >
> > Anton Vinogradov, would this work for you as a workaround?
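
A minimal sketch of this workaround for a single node, assuming a hypothetical
persistence layout (the consistent-id directory and cache group name below are
illustrative, not taken from a real cluster); it only automates the
partition-file removal step, stopping and restarting the node happens outside
of it:

import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class PartitionCleanup {
    public static void main(String[] args) throws IOException {
        // 1. The Ignite node is assumed to be stopped at this point.
        // 2. Delete the partition files of the fragmented cache group
        //    (directory names are illustrative; check your work/db layout).
        Path storeDir = Paths.get("work", "db", "node00_consistent_id", "cacheGroup-myGroup");
        try (Stream<Path> parts = Files.list(storeDir)) {
            parts.filter(p -> p.getFileName().toString().matches("part-\\d+\\.bin"))
                 .forEach(p -> p.toFile().delete());
        }
        // 3. Restart the node: a full rebalance repopulates the partitions
        //    without the previously deleted entries, i.e. without fragmentation.
    }
}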
> >
> > On Thu, Sep 19, 2019 at 13:03, Anton Vinogradov <av@apache.org> wrote:
> >
> >> Alexey,
> >>
> >> Let's combine your and Ivan's proposals.
> >>
> >> >> vacuum command, which acquires exclusive table lock, so no concurrent
> >> >> activities on the table are possible.
> >> and
> >> >> Could the problem be solved by stopping a node which needs to be
> >> >> defragmented, clearing persistence files and restarting the node?
> >> >> After rebalancing the node will receive all data back without
> >> >> fragmentation.
> >>
> >> How about having a special partition state, SHRINKING?
> >> This state should mean that the partition is unavailable for reads and
> >> updates, but it should keep its update counters and should not be marked
> >> as lost, renting or evicted.
> >> In this state we are able to iterate over the partition and apply its
> >> entries to another file in a compact way.
> >> Indices should be updated during the copy-on-shrink procedure or at
> >> shrink completion.
> >> Once the shrunk file is ready, we should replace the original partition
> >> file with it and mark the partition as MOVING, which will start the
> >> historical rebalance.
> >> Shrinking should be performed during low-activity periods, but even if
> >> we find that activity was high and a historical rebalance is not
> >> suitable, we may just remove the file and use a regular rebalance to
> >> restore the partition (which also results in a shrink).
> >>
> >> BTW, it seems we are able to implement partition shrink in a cheap way.
> >> We may just reuse the rebalancing code to apply the fat partition's
> >> entries to the new file.
> >> So, 3 stages here: local rebalance, indices update and global historical
> >> rebalance.
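
To make the proposed flow concrete, here is a rough sketch of the sequence of
steps in Java. Everything in it is hypothetical: the SHRINKING state, the
types and the method names do not exist in the Ignite codebase; it only
illustrates the order of operations described above.

import java.util.List;

// Proposed partition states; SHRINKING is the new one.
enum PartState { MOVING, OWNING, RENTING, LOST, SHRINKING }

interface PartitionFile {
    void append(byte[] entry);      // write an entry at the next compact position
}

interface Partition {
    void state(PartState s);
    List<byte[]> liveEntries();     // iterate live (non-removed) entries
    void replaceFile(PartitionFile f);
}

class ShrinkProcedure {
    void shrink(Partition part, PartitionFile compactFile) {
        part.state(PartState.SHRINKING);   // no reads/updates; update counters kept,
                                           // partition not marked lost/renting/evicted
        for (byte[] e : part.liveEntries())
            compactFile.append(e);         // "local rebalance" into a compact file
        // indices would be updated here, or on the fly (copy-on-shrink)
        part.replaceFile(compactFile);     // swap the fat file for the shrunk one
        part.state(PartState.MOVING);      // triggers historical rebalance to catch up
    }
}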
> >>
> >> On Thu, Sep 19, 2019 at 11:43 AM Alexey Goncharuk <
> >> alexey.goncharuk@gmail.com> wrote:
> >>
> >> > Anton,
> >> >
> >> >
> >> > > >> The solution which Anton suggested does not look easy because it
> >> > > >> will most likely significantly hurt performance.
> >> > > Mostly agree here, but what drop do we expect? What price are we
> >> > > ready to pay?
> >> > > Not sure, but it seems some vendors are ready to pay, for example, a
> >> > > 5% drop for this.
> >> >
> >> > 5% may be a big drop for some use-cases, so I think we should look at
> >> > how to improve performance, not how to make it worse.
> >> >
> >> >
> >> > >
> >> > > >> it is hard to maintain a data structure to choose "page from
> >> > > >> free-list with enough space closest to the beginning of the file".
> >> > > We can just split each free-list bucket into a pair and use the first
> >> > > for pages in the first half of the file and the second for the rest.
> >> > > Only two buckets are required here since, during the file shrink, the
> >> > > first bucket's window will shrink too.
> >> > > It seems this gives us the same price on put: just use the first
> >> > > bucket if it's not empty.
> >> > > The remove price (with merge) will increase, of course.
> >> > >
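A simplified, self-contained sketch of the two-bucket idea (the class and
method names are made up, this is not Ignite's actual free-list code): pages
from the first half of the file are preferred on put, so new data gravitates
towards the head of the file and the tail can eventually be truncated.

import java.util.ArrayDeque;
import java.util.Deque;

// One free-list size bucket, split by file half.
class SplitBucket {
    private final Deque<Long> head = new ArrayDeque<>(); // pages in the first half of the file
    private final Deque<Long> tail = new ArrayDeque<>(); // pages in the second half of the file
    private final long midOffset;                        // current middle of the partition file

    SplitBucket(long fileSize) { this.midOffset = fileSize / 2; }

    // Same cost as today when a page with free space is released: one push.
    void release(long pageOffset) {
        (pageOffset < midOffset ? head : tail).push(pageOffset);
    }

    // On put, prefer a page from the first half so the file can shrink from the end.
    Long takeForPut() {
        if (!head.isEmpty()) return head.poll();
        if (!tail.isEmpty()) return tail.poll();
        return null; // bucket empty, allocate a new page
    }
}
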
> >> > > The compromise solution is to have priority puts (into the first part
> >> > > of the file), keep removal as is, and schedule per-page migration of
> >> > > the rest of the data during low-activity periods.
> >> > >
> >> > Free lists are large and slow by themselves, and they are expensive to
> >> > checkpoint and read on start, so as a long-term solution I would look
> >> > into removing them. Moreover, I am not sure that adding yet another
> >> > background process will improve the codebase's reliability and
> >> > simplicity.
> >> >
> >> > If we want to go the hard path, I would look at a free-page-tracking
> >> > bitmap: a special bitmask page where each page in an adjacent block is
> >> > marked as 0 (free) if it has more free space than a certain
> >> > configurable threshold (say, 80%), and 1 (full) if less. Some vendors
> >> > have successfully implemented this approach, which looks much more
> >> > promising, but is harder to implement.
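
A minimal sketch of such a tracking bitmap, assuming an illustrative
fullness threshold of 80% free space; the class and its methods are
hypothetical, not existing Ignite code.

import java.util.BitSet;

// Tracks fullness of an adjacent block of data pages: bit = 0 means the page
// still has more free space than the threshold, bit = 1 means it is "full".
class FreePageBitmap {
    private static final double FREE_THRESHOLD = 0.80; // configurable in the real design
    private final BitSet full;
    private final int pagesInBlock;

    FreePageBitmap(int pagesInBlock) {
        this.pagesInBlock = pagesInBlock;
        this.full = new BitSet(pagesInBlock);
    }

    // Mark the page as 1 ("full") when its free space drops below the
    // threshold, 0 ("free") otherwise.
    void onPageUpdated(int pageIdx, int freeBytes, int pageSize) {
        full.set(pageIdx, freeBytes < pageSize * FREE_THRESHOLD);
    }

    // Index of the first page with enough free space, or -1 if the block is full.
    int firstFreePage() {
        int idx = full.nextClearBit(0);
        return idx < pagesInBlock ? idx : -1;
    }
}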
> >> >
> >> > --AG
> >> >
> >>
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>
