ignite-dev mailing list archives

From Sergi Vladykin <sergi.vlady...@gmail.com>
Subject Re: Disk page compression for Ignite persistent store
Date Tue, 20 Nov 2018 05:18:20 GMT
Denis,

See inline.


Mon, Nov 19, 2018 at 20:17, Denis Magda <dmagda@apache.org>:

> Hi Sergi,
>
> Didn't know you were cooking this dish in the background ) Excellent.  Just
> to be sure, that's part of this IEP, right?
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite#IEP-20:DataCompressioninIgnite-Withoutin-memorycompression


Correct.


>
>
> > Since we can release only full file system blocks, which are typically 4k
> > in size, the user must configure the page size to be a multiple of the FS
> > block size (at least two blocks), e.g. 8k or 16k. It also means that the
> > best achievable compression ratio here is fsBlockSize / pageSize =
> > 4k / 16k = 0.25
>
>
> How do we handle the case where the page size is not a multiple of 4k? What
> is the optimal page size if the user wants to get the best compression?
> Probably, we can adjust the default page size automatically if it's a clean
> deployment.
>
>
We already force the page size to be a power of 2 between 1k and 16k. Thus
there are only 2 options greater than 4k: either 8k or 16k. So the page just
has to be large enough.
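
The constraint above can be sketched as a small check. This is only an illustration: the class and method names are hypothetical and not taken from the Ignite code base, and the bounds are the ones stated in this message.

```java
/** Hypothetical helper for illustration; not part of Ignite. */
final class PageSizeCheck {
    static final int MIN_PAGE_SIZE = 1024;       // 1k lower bound, as stated above
    static final int MAX_PAGE_SIZE = 16 * 1024;  // 16k upper bound, as stated above

    /** True if pageSize is a power of two within [1k, 16k]. */
    static boolean isValidPageSize(int pageSize) {
        return pageSize >= MIN_PAGE_SIZE
            && pageSize <= MAX_PAGE_SIZE
            && Integer.bitCount(pageSize) == 1;
    }

    /** True if the page spans at least two whole FS blocks, so at least one can be released. */
    static boolean isCompressible(int pageSize, int fsBlockSize) {
        return isValidPageSize(pageSize)
            && pageSize % fsBlockSize == 0
            && pageSize / fsBlockSize >= 2;
    }

    public static void main(String[] args) {
        // Walk all allowed page sizes and report which ones can benefit with 4k FS blocks.
        for (int size = MIN_PAGE_SIZE; size <= MAX_PAGE_SIZE; size *= 2)
            System.out.println(size + " -> " + isCompressible(size, 4096));
    }
}
```

With a 4k block size, only 8192 and 16384 pass the check.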

Obviously, the larger the page size, the better the compression you can get,
but very large pages may hurt performance. Thus 8k with a best-case ratio of
0.5 or 16k with a best-case ratio of 0.25 should be OK for most cases.
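
To make the arithmetic concrete, here is a minimal sketch of how the on-disk footprint of one page could be estimated. The names are hypothetical, and it assumes that at least one FS block always stays allocated per page (which is what the fsBlockSize / pageSize bound implies) and that a partially used block cannot be released.

```java
/** Hypothetical illustration of the block arithmetic; not Ignite code. */
final class HolePunchMath {
    /** Number of FS blocks a page still occupies after the unused blocks are released. */
    static int blocksOnDisk(int compressedBytes, int fsBlockSize) {
        // Round up to whole blocks; assume at least one block is always kept.
        int blocks = (compressedBytes + fsBlockSize - 1) / fsBlockSize;
        return Math.max(1, blocks);
    }

    /** Disk space actually used divided by the uncompressed page size. */
    static double effectiveRatio(int compressedBytes, int pageSize, int fsBlockSize) {
        return (double) blocksOnDisk(compressedBytes, fsBlockSize) * fsBlockSize / pageSize;
    }

    public static void main(String[] args) {
        // A 16k page compressed to 1000 bytes still occupies one 4k block.
        System.out.println(effectiveRatio(1000, 16384, 4096));
        // The same payload in an 8k page cannot do better than half.
        System.out.println(effectiveRatio(1000, 8192, 4096));
        // 5000 compressed bytes need two 4k blocks.
        System.out.println(effectiveRatio(5000, 16384, 4096));
    }
}
```

The point of the sketch is that the ratio moves in steps of fsBlockSize / pageSize, so compressing a page below one block brings no further savings.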



> > There will be 2 new properties on CacheConfiguration
> > (setDiskPageCompression and setDiskPageCompressionLevel) to set up disk
> > page compression.
>
>
> How about setting it at DataRegionConfiguration level as well so that it's
> applied for all the caches/tables from there?
>
>
That does not seem to make much sense until we can tweak the page size for
different data regions independently (currently we can't). I would start
with that first.

Sergi


> --
> Denis
>
> On Mon, Nov 19, 2018 at 2:01 AM Sergi Vladykin <sergi.vladykin@gmail.com>
> wrote:
>
> > Folks,
> >
> > I've implemented page compression for the persistent store and am going
> > to merge it to master.
> >
> > https://github.com/apache/ignite/pull/5200
> >
> > Some design notes:
> >
> > It employs the "hole punching" approach: pages are kept uncompressed in
> > memory, but when they get written to disk, they are compressed and all
> > the extra file system blocks for the page are released. Thus the storage
> > files become sparse.
> >
> > Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
> > SKIP_GARBAGE. All of them are self-explanatory except SKIP_GARBAGE, which
> > just takes only the meaningful data from half-filled pages and does
> > not apply any compression. It is easy to add more if needed.
> >
> > Since we can release only full file system blocks, which are typically 4k
> > in size, the user must configure the page size to be a multiple of the FS
> > block size (at least two blocks), e.g. 8k or 16k. It also means that the
> > best achievable compression ratio here is fsBlockSize / pageSize =
> > 4k / 16k = 0.25
> >
> > It is possible to enable compression for existing databases if they were
> > configured with a large enough page size. In this case pages will be
> > written to disk in compressed form when updated, and the database will
> > become compressed gradually.
> >
> > There will be 2 new properties on CacheConfiguration
> > (setDiskPageCompression and setDiskPageCompressionLevel) to set up disk
> > page compression.
> >
> > Compression dictionaries are not supported at this time, but may be in the
> > future. IMO they should be added as a separate feature if needed.
> >
> > The only supported platform for now is Linux. Since all popular file
> > systems support sparse files, it should be relatively easy to support more
> > platforms.
> >
> > Please take a look and provide your thoughts and suggestions.
> >
> > Thanks!
> >
> > Sergi
> >
>
