ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pavel Pereslegin <xxt...@gmail.com>
Subject Re: IgniteSet implementation: changes required
Date Tue, 25 Sep 2018 17:12:25 GMT
Hello Igniters.

As was discussed, IgniteSet implementation was based on on-heap data
duplication (setDataMap), as a result, the data was not recovered after
cluster restart and in the case of large data sets, this led to a
significant heap growing and gc pressure.

We changed the implementation so that this structure works well without
duplicating the data [1]. To reduce performance drop and speed up large
data sets, non-collocated version of IgniteSet now uses separate cache [2].

[1] https://issues.apache.org/jira/browse/IGNITE-5553
[2] https://issues.apache.org/jira/browse/IGNITE-7823



ср, 27 июн. 2018 г. в 23:26, Amir Akhmedov <amir.akhmedov@gmail.com>:

> Yes, you are right.
>
> Thanks,
> Amir
>
>
> On Wed, Jun 27, 2018 at 1:15 PM Denis Magda <dmagda@apache.org> wrote:
>
> > Got you. If it's about redundant data duplication in onheap region then
> no
> > any concerns from my side.
> >
> > Anyway, considering that the data structure will be interacting with the
> > page memory directly then its entries can be stored in Ignite persistence
> > automatically (if the latter is on). Does it mean that the data structure
> > will be fully recovered after a restart and its entries can be pulled
> from
> > disk on demand?
> >
> > --
> > Denis
> >
> >
> > On Tue, Jun 26, 2018 at 1:49 PM Amir Akhmedov <amir.akhmedov@gmail.com>
> > wrote:
> >
> > > I also think it will better to remove setDataMap support cause
> > > 1. It's making extra pressure on GC by keeping entries on heap
> > > 2. It has difficult logic to support with lots of nuances
> > > 3. To maintain setDataMap today GridCacheMapEntry calls
> > > cctx.dataStructures().onEntryUpdated() on each entry mutation. I think
> > it's
> > > unnecessary cohesion.
> > > 4. For the case with single Ignite cache for all collocated
> > datastructure,
> > > an iterator creation will not be much slower than current
> implementation
> > > since we can run affinity call on the node where all entries reside.
> > Also,
> > > we can create a better affinity mapper to fairly distribute
> > datastructures
> > > across a cluster rather than mapping by datastructure's name.
> > >
> > > Thanks,
> > > Amir
> > >
> > >
> > > On Tue, Jun 26, 2018 at 8:10 AM Anton Vinogradov <av@apache.org>
> wrote:
> > >
> > > > Denis,
> > > >
> > > > I think that better case is to remove onheap
> optimisation/duplication.
> > > > This brings no drop to frequently used operations (put/remove), but
> > even
> > > > will make it slightly faster.
> > > >
> > > > The only one question we have here is "is it possible to restore
> onheap
> > > map
> > > > in easy way?".
> > > > Seems that answer is no, so, I vote for setDataMap removal.
> > > >
> > > > вт, 26 июн. 2018 г. в 15:00, Denis Magda <dmagda@apache.org>:
> > > >
> > > > > Anton,
> > > > >
> > > > > Will it be possible to reuse such a functionality for the rest of
> > data
> > > > > structures? I would invest our time in this if all data structures
> > > would
> > > > be
> > > > > able to work with Ignite persistence this way.
> > > > >
> > > > > --
> > > > > Denis
> > > > >
> > > > > On Tue, Jun 26, 2018 at 1:53 AM Anton Vinogradov <av@apache.org>
> > > wrote:
> > > > >
> > > > > > >> Why don't we read data straight from the persistence
layer
> > warming
> > > > RAM
> > > > > > up
> > > > > > >> in the background?
> > > > > > Because it's not a trivial task to finish such loading on
> unstable
> > > > > > topology.
> > > > > > That's possible, ofcourse, but solution and complexity will
be
> > almost
> > > > > > equals to WAL enable/disable.
> > > > > >
> > > > > > пн, 25 июн. 2018 г. в 22:13, Denis Magda <dmagda@apache.org>:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > Why don't we read data straight from the persistence layer
> > warming
> > > > RAM
> > > > > up
> > > > > > > in the background? (like we do for SQL and other APIs).
If
> it's a
> > > > > > question
> > > > > > > of time, then I would suggest us not to hurry up and do
it in a
> > > right
> > > > > > way.
> > > > > > >
> > > > > > > --
> > > > > > > Denis
> > > > > > >
> > > > > > > On Mon, Jun 25, 2018 at 6:20 AM Anton Vinogradov <
> av@apache.org>
> > > > > wrote:
> > > > > > >
> > > > > > > > +1 to removal in case there is no easy, fast and consistent
> way
> > > to
> > > > > > > restore
> > > > > > > > setDataMap on node restart.
> > > > > > > > I see that we'll gain some performance drop on size()
or
> > keys(),
> > > > but
> > > > > > > these
> > > > > > > > methods are rarely used.
> > > > > > > >
> > > > > > > > пн, 25 июн. 2018 г. в 16:07, Pavel Pereslegin
<
> > xxtern@gmail.com
> > > >:
> > > > > > > >
> > > > > > > > > Hello, Igniters.
> > > > > > > > >
> > > > > > > > > I tried to implement IgniteSet data recovery
when
> persistence
> > > > > enabled
> > > > > > > > > [1] using trivial cache scanning, however I cannot
find
> > optimal
> > > > way
> > > > > > to
> > > > > > > > > do that because of the following reasons:
> > > > > > > > > - Performing operations on IgniteSet requires
completion of
> > > data
> > > > > > > > > loading (restoring of setDataMap) on all nodes.
Do this
> > during
> > > > > > > > > partition map exchange is too long.
> > > > > > > > > - The prohibition of operations on IgniteSet
before the
> > > > completion
> > > > > of
> > > > > > > > > asynchronous cache scanning on all nodes looks
rather
> > > > complicated,
> > > > > > > > > because It is necessary to support all situations
of
> unstable
> > > > > > > > > topology.
> > > > > > > > >
> > > > > > > > > So I see one option to fix data loss on node
restart -
> remove
> > > the
> > > > > > > > > entire optimization (setDataMap) and rework the
iterator
> > > > > > > > > implementation to perform cache scanning.
> > > > > > > > >
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-5553
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2018-03-17 8:20 GMT+03:00 Andrey Kuznetsov <
> > stkuzma@gmail.com
> > > >:
> > > > > > > > > > Thanks, Dmitry. I agree ultimately, DS API
uniformity is
> a
> > > > > weighty
> > > > > > > > > reason.
> > > > > > > > > >
> > > > > > > > > > 2018-03-17 3:54 GMT+03:00 Dmitriy Setrakyan
<
> > > > > dsetrakyan@apache.org
> > > > > > >:
> > > > > > > > > >
> > > > > > > > > >> On Fri, Mar 16, 2018 at 7:39 AM, Andrey
Kuznetsov <
> > > > > > > stkuzma@gmail.com>
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Dmitry, your way allows to reuse
existing
> > {{Ignite.set()}}
> > > > API
> > > > > > to
> > > > > > > > > create
> > > > > > > > > >> > both set flavors. We can adopt
it unless somebody in
> the
> > > > > > community
> > > > > > > > > >> objects.
> > > > > > > > > >> > Personally, I like {{IgniteCache.asSet()}}
approach
> > > proposed
> > > > > by
> > > > > > > > > Vladimir
> > > > > > > > > >> O.
> > > > > > > > > >> > more, since it emphasizes the difference
between sets
> > > being
> > > > > > > created,
> > > > > > > > > but
> > > > > > > > > >> > this will require API extension.
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> Andrey, I am suggesting that Ignite.set(...)
in
> > > non-collocated
> > > > > > mode
> > > > > > > > > behaves
> > > > > > > > > >> exactly the same as the proposed IgniteCache.asSet()
> > > method. I
> > > > > do
> > > > > > > not
> > > > > > > > > like
> > > > > > > > > >> the IgniteCache.asSet() API because
it is inconsistent
> > with
> > > > > Ignite
> > > > > > > > data
> > > > > > > > > >> structure design. All data structures
are provided on
> > Ignite
> > > > API
> > > > > > > > > directly
> > > > > > > > > >> and we should not change that.
> > > > > > > > > >>
> > > > > > > > > >> D.
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best regards,
> > > > > > > > > >   Andrey Kuznetsov.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message