ignite-dev mailing list archives

From Dmitry Pavlov <dpavlov....@gmail.com>
Subject Re: Data compression design proposal
Date Mon, 26 Mar 2018 16:06:30 GMT
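+1 to Alexey's concern. There is no reason to compress whole pages if we keep
computing the page offset as before, pageIdx*pageSize: every page would still
occupy a full pageSize slot in the file.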
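
To make the offset concern concrete, here is a minimal sketch. It is not Ignite's actual page store code: the class and method names are hypothetical, and it ignores any file header. It only shows that fixed-size pages allow a direct offset calculation, while variable-size compressed pages packed back to back would need an extra lookup table.

```java
// Hypothetical illustration only; not taken from the Ignite code base.
class PageOffsetSketch {
    /** Fixed-size pages: the offset follows directly from the page index. */
    static long fixedOffset(long pageIdx, int pageSize) {
        return pageIdx * pageSize;
    }

    /**
     * Variable-size (compressed) pages packed back to back: each offset is the
     * sum of all preceding compressed sizes, so a lookup table is needed.
     */
    static long[] packedOffsets(int[] compressedSizes) {
        long[] offsets = new long[compressedSizes.length];
        long pos = 0;
        for (int i = 0; i < compressedSizes.length; i++) {
            offsets[i] = pos;
            pos += compressedSizes[i];
        }
        return offsets;
    }

    public static void main(String[] args) {
        int pageSize = 4096;
        // Direct calculation, no metadata required.
        System.out.println(fixedOffset(3, pageSize));
        // Assumed compressed sizes; the offset of page 3 depends on pages 0..2.
        long[] packed = packedOffsets(new int[] {1000, 2500, 700, 4096});
        System.out.println(packed[3]);
    }
}
```

If compressed pages are instead written at pageIdx*pageSize, the lookup table is unnecessary, but the file does not shrink, which is the point made above.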

Mon, 26 Mar 2018 at 18:59, Alexey Goncharuk <alexey.goncharuk@gmail.com>:

> Guys,
>
> How does this fit the PageMemory concept? Currently it assumes that the
> size of a page in memory and the size of a page on disk are the same, so
> only per-entry compression within a page makes sense.
>
> If you compress a whole page, how do you calculate the page offset in the
> target data file?
>
> --AG
>
> 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com>:
>
> > Gents,
> >
> > If I understood the idea correctly, the proposal is to compress pages on
> > eviction and decompress them on read from disk. Is it correct?
> >
> > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov <av@apache.org> wrote:
> >
> > > + 1 to Taras's vision.
> > >
> > > Compression on eviction is a good way to store more.
> > > Pages in memory are always hot in a real system, so compression in memory
> > > will definitely slow down the system, I think.
> > >
> > > Anyway, we can split the issue into "on-eviction compression" and
> > > "in-memory compression".
> > >
> > >
> > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov <tledkov@gridgain.com>:
> > >
> > > > Hi,
> > > >
> > > > I guess page-level compression makes sense on page loading / eviction.
> > > > In this case we can decrease the number of I/O operations, and a
> > > > performance boost can be reached.
> > > > What is the goal of in-memory compression? To hold about 2-5x more data
> > > > in memory at the cost of a performance drop?
> > > >
> > > > Also, please clarify the case with compression/decompression for hot and
> > > > cold pages.
> > > > Is the following right for your approach?
> > > > 1. Hot pages are always decompressed in memory because many read/write
> > > > operations touch them.
> > > > 2. So we can compress only cold pages.
> > > >
> > > > So this approach is suitable when the hot data size << available RAM size.
> > > >
> > > > Thoughts?
> > > >
> > > >
> > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote:
> > > >
> > > >> Hi Igniters!
> > > >>
> > > > >> I’d like to take the next step in our data compression discussion [1].
> > > > >>
> > > > >> Most Igniters voted for per-data-page compression.
> > > > >>
> > > > >> I’d like to collect the main theses to start implementation:
> > > > >> - a page will be compressed with a dictionary-based approach (e.g. LZV)
> > > > >> - a page will be compressed in batch mode (not on every change)
> > > > >> - page compression should be initiated by an event, for example, when a
> > > > >> page’s free space drops below 20%
> > > > >> - the compression process will run under the page write lock
> > > >> Vladimir Ozerov has written:
> > > >>
> > > > >>> What we do not understand yet:
> > > > >>> 1) Granularity of the compression algorithm.
> > > > >>> 1.1) It could be per-entry, i.e. we compress the whole entry content
> > > > >>> but respect boundaries between entries. E.g.: before -
> > > > >>> [ENTRY_1][ENTRY_2], after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2]
> > > > >>> (as opposed to [COMPRESSED ENTRY_1 and ENTRY_2]).
> > > > >>> 1.2) Or it could be per-field, i.e. we compress fields but respect the
> > > > >>> binary object layout. The first approach is simple and straightforward
> > > > >>> and will give an acceptable compression rate, but we will have to
> > > > >>> decompress the whole binary object on every field access, which may
> > > > >>> ruin our SQL performance. The second approach is more complex and we
> > > > >>> are not sure about its compression rate, but as the BinaryObject
> > > > >>> structure is preserved, we will still have fast constant-time
> > > > >>> per-field access.
> > > > >>>
> > > > >> I think there are advantages in both approaches, and we will be able to
> > > > >> compare different approaches and algorithms after a prototype
> > > > >> implementation.
> > > >>
> > > > >> Main approach in brief:
> > > > >> 1) When a page’s free space drops below 20%, a compression event will
> > > > >> be triggered
> > > > >> 2) The page will be locked with a write lock
> > > > >> 3) The page will be passed to the page compressor implementation
> > > > >> 4) The page will be replaced by the compressed page
> > > > >>
> > > > >> Reading a whole object or a field:
> > > > >> 1) If the page is marked as compressed, it will be handled by the
> > > > >> page compressor implementation; otherwise, it will be handled as
> > > > >> usual.
> > > >>
> > > >> Thoughts?
> > > >>
> > > > >> Should we create a new IEP and register tickets to start implementation?
> > > > >> This will allow us to track the feature's progress and related
> > > > >> tasks.
> > > > >>
> > > > >>
> > > > >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-tc20679.html
> > > >>
> > > >>
> > > >>
> > > > --
> > > > Taras Ledkov
> > > > Mail-To: tledkov@gridgain.com
> > > >
> > > >
> > >
> >
>
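
The "Main approach in brief" steps quoted above from Vyacheslav's message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed Ignite implementation: the class CompressiblePageSketch and all of its members are hypothetical, java.util.zip.Deflater (DEFLATE) stands in for whichever dictionary-based compressor is eventually chosen, and the page is modeled as a plain byte array that becomes read-only once compressed.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of "compress when free space drops below 20%".
class CompressiblePageSketch {
    static final double FREE_SPACE_THRESHOLD = 0.20;

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final int pageSize;
    private byte[] data;        // raw page bytes, or compressed bytes
    private int used;           // bytes used in uncompressed form
    private boolean compressed; // the "page marked as compressed" flag

    CompressiblePageSketch(int pageSize) {
        this.pageSize = pageSize;
        this.data = new byte[pageSize];
    }

    /** Appends an entry; returns false if the page is compressed or full. */
    boolean append(byte[] entry) {
        lock.writeLock().lock(); // step 2: the page is under the write lock
        try {
            if (compressed || used + entry.length > pageSize)
                return false;
            System.arraycopy(entry, 0, data, used, entry.length);
            used += entry.length;
            // Step 1: the trigger, free space dropped below the threshold.
            if ((double) (pageSize - used) / pageSize < FREE_SPACE_THRESHOLD)
                compress();
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Steps 3 and 4; must be called with the write lock held. */
    private void compress() {
        Deflater deflater = new Deflater();
        deflater.setInput(data, 0, used);
        deflater.finish();
        byte[] buf = new byte[pageSize + 64];
        int len = deflater.deflate(buf);
        boolean done = deflater.finished();
        deflater.end();
        if (!done || len >= used)
            return; // incompressible content: keep the raw page
        data = Arrays.copyOf(buf, len); // replace the page with its compressed form
        compressed = true;
    }

    /** Read path: compressed pages go through the decompressor, others as usual. */
    byte[] read() {
        lock.readLock().lock();
        try {
            if (!compressed)
                return Arrays.copyOf(data, used);
            Inflater inflater = new Inflater();
            inflater.setInput(data);
            byte[] out = new byte[used];
            int n = 0;
            try {
                while (n < out.length) {
                    int r = inflater.inflate(out, n, out.length - n);
                    if (r == 0)
                        break;
                    n += r;
                }
            } catch (DataFormatException e) {
                throw new IllegalStateException("Corrupted compressed page", e);
            } finally {
                inflater.end();
            }
            return Arrays.copyOf(out, n);
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean isCompressed() { return compressed; }

    int storedSize() { return compressed ? data.length : used; }
}
```

Note that the sketch decompresses the whole page on every read, which is the cost Anton and Taras raise for hot pages, and that it says nothing about where the shorter compressed page lands in the data file, which is Alexey's question.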
