ignite-dev mailing list archives

From Alex Plehanov <plehanov.a...@gmail.com>
Subject Re: Disk page compression for Ignite persistent store
Date Mon, 11 Mar 2019 13:59:00 GMT
Hello Igniters!

I've implemented compression of WAL page snapshot records. Ticket [1], PR
[2]. I've used page compression module implemented by Sergi Vladykin for
page store.

To configure WAL page record compression, two properties have been added to
DataStorageConfiguration: walPageCompression and walPageCompressionLevel.
Unlike page store compression, WAL compression doesn't use sparse files and
can be used on any file system (it is also not necessary to enable page
store compression in order to enable WAL page record compression).

WAL page snapshot compression is most useful, and performs best, when the
Ignite instance uses many caches and partitions. In that case page snapshot
records take up a considerable part of the WAL (more than 90% of the WAL
size in my tests).

I've done some benchmarks using the Yardstick framework and got pretty good
results. Not only is the WAL size significantly reduced, but throughput and
latency also improve. I've attached some of the benchmark results to the
ticket.

Can anyone review the patch?

[1]: https://issues.apache.org/jira/browse/IGNITE-11336
[2]: https://github.com/apache/ignite/pull/6116

On Tue, Nov 20, 2018 at 08:19 Sergi Vladykin <sergi.vladykin@gmail.com> wrote:

> Denis,
>
> See inline.
>
>
> On Mon, Nov 19, 2018 at 20:17 Denis Magda <dmagda@apache.org> wrote:
>
> > Hi Sergi,
> >
> > Didn't know you were cooking this dish in the background ) Excellent.
> Just
> > to be sure, that's part of this IEP, right?
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite#IEP-20:DataCompressioninIgnite-Withoutin-memorycompression
>
>
> Correct.
>
>
> >
> >
> > > Since we can release only full file system blocks, which are typically
> > > 4k in size, the user must configure the page size to be a multiple of
> > > the FS block size, e.g. 8k or 16k. It also means that the max
> > > compression ratio here is fsBlockSize / pageSize = 4k / 16k = 0.25
> >
> >
> > How do we handle the case where the page size is not a multiple of 4k?
> > What is the optimal page size if the user wants the best compression?
> > Probably, we can adjust the default page size automatically if it's a
> > clean deployment.
> >
> >
> We already force the page size to be between 1k and 16k and a power of 2.
> Thus there are only two options greater than 4k: either 8k or 16k. So the
> page just has to be large enough.
>
> Obviously, the greater the page size, the better the compression, but very
> large pages may hurt performance. Thus 8k with ratio 0.5 or 16k with ratio
> 0.25 should be fine for most cases.
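The ratio arithmetic above can be sketched in a few lines of Java. This is a best-case model based solely on the numbers in the thread (a 4k FS block, whole-block rounding), not code taken from Ignite:

```java
public class CompressionRatio {
    static final int FS_BLOCK = 4 * 1024; // typical file system block size

    /** Bytes a page still occupies on disk after its trailing, now-unused
     *  file system blocks have been released. */
    static long onDiskSize(int compressedBytes) {
        // Round up to a whole number of FS blocks.
        return ((compressedBytes + FS_BLOCK - 1L) / FS_BLOCK) * FS_BLOCK;
    }

    public static void main(String[] args) {
        // A 16k page compressed to almost nothing still occupies one 4k
        // block, hence the best possible ratio of 4k / 16k = 0.25:
        System.out.println(onDiskSize(100));  // 4096
        // 5000 compressed bytes need two blocks: 8192 on disk.
        System.out.println(onDiskSize(5000)); // 8192
    }
}
```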
>
>
>
> > > There will be 2 new properties on CacheConfiguration
> > > (setDiskPageCompression and setDiskPageCompressionLevel) to set up
> > > disk page compression.
> >
> >
> > How about setting it at the DataRegionConfiguration level as well, so
> > that it's applied to all the caches/tables there?
> >
> >
> Does not seem to make much sense until we can tweak the page size for
> different data regions independently (currently we can't). I would start
> with that first.
>
> Sergi
>
>
> --
> > Denis
> >
> > On Mon, Nov 19, 2018 at 2:01 AM Sergi Vladykin <sergi.vladykin@gmail.com
> >
> > wrote:
> >
> > > Folks,
> > >
> > > I've implemented page compression for persistent store and going to
> merge
> > > it to master.
> > >
> > > https://github.com/apache/ignite/pull/5200
> > >
> > > Some design notes:
> > >
> > > It employs a "hole punching" approach: the pages are kept
> > > uncompressed in memory, but when they get written to disk, they are
> > > compressed and all the extra file system blocks for the page are
> > > released. Thus the storage files become sparse.
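A small Java sketch of sparse-file semantics. It is illustrative only: it creates a hole by seeking past the end rather than punching one in written data, since fallocate(FALLOC_FL_PUNCH_HOLE) has no portable Java API (the patch uses native calls for that); the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class SparseFileDemo {
    /** Creates a file whose first {@code holeBytes} bytes are a hole,
     *  consuming no disk blocks on sparse-capable file systems (ext4, xfs). */
    static Path createSparse(long holeBytes) throws IOException {
        Path p = Files.createTempFile("sparse", ".bin");
        try (RandomAccessFile f = new RandomAccessFile(p.toFile(), "rw")) {
            // Seeking past the end and writing one byte leaves the skipped
            // range unallocated: logical size grows, physical usage doesn't.
            f.seek(holeBytes);
            f.write(0xAB);
        }
        return p;
    }

    public static void main(String[] args) throws IOException {
        Path p = createSparse(1 << 20); // 1 MiB hole + 1 byte of data
        // Logical size counts the hole; physical usage is roughly one block.
        System.out.println("logical size = " + Files.size(p));
        Files.delete(p);
    }
}
```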
> > >
> > > Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
> > > SKIP_GARBAGE. All of them are self-explanatory except SKIP_GARBAGE,
> > > which just keeps the meaningful data from half-filled pages but does
> > > not apply any compression. It is easy to add more if needed.
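The SKIP_GARBAGE idea can be sketched as follows. This is a deliberate simplification: it models a page as used bytes followed by garbage and drops the tail, whereas the real implementation compacts the actual page layout; the class and method names are hypothetical.

```java
import java.util.Arrays;

public class SkipGarbageSketch {
    /** Keep only the first {@code used} bytes of a page; no compression
     *  algorithm is applied, so this costs almost no CPU. */
    static byte[] skipGarbage(byte[] page, int used) {
        return Arrays.copyOf(page, used);
    }

    public static void main(String[] args) {
        byte[] page = new byte[16 * 1024]; // a 16k page...
        int used = 6 * 1024;               // ...with only 6k of real data

        byte[] compacted = skipGarbage(page, used);

        // 6k of meaningful bytes remain; on disk this rounds up to two
        // 4k blocks, so even without compression the page shrinks.
        System.out.println(compacted.length); // 6144
    }
}
```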
> > >
> > > Since we can release only full file system blocks, which are typically
> > > 4k in size, the user must configure the page size to be a multiple of
> > > the FS block size, e.g. 8k or 16k. It also means that the max
> > > compression ratio here is fsBlockSize / pageSize = 4k / 16k = 0.25
> > >
> > > It is possible to enable compression for existing databases if they
> > > were configured with a large enough page size. In this case pages will
> > > be written to disk in compressed form when updated, and the database
> > > will become compressed gradually.
> > >
> > > There will be 2 new properties on CacheConfiguration
> > > (setDiskPageCompression and setDiskPageCompressionLevel) to set up
> > > disk page compression.
> > >
> > > Compression dictionaries are not supported at this time, but may be in
> > > the future. IMO this should be added as a separate feature if needed.
> > >
> > > The only supported platform for now is Linux. Since all popular file
> > > systems support sparse files, it must be relatively easy to support
> > > more platforms.
> > >
> > > Please take a look and provide your thoughts and suggestions.
> > >
> > > Thanks!
> > >
> > > Sergi
> > >
> >
>
