ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: Logical Cache Documented
Date Wed, 04 Oct 2017 05:39:42 GMT
I do not think that a bigger B+Tree matters much. I was talking only about
data blocks. When you have a lot of logical caches, all of them are mixed
in the same data blocks. As a result you typically have to perform more IO
operations to read the same amount of data, as data block content becomes
more "chaotic".
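
To make the data block point concrete, here is a toy sketch. It is not
Ignite code and does not reflect the real page layout. It assumes a
hypothetical page capacity of 4 entries and that entries of the grouped
caches are appended round-robin into shared pages, then counts how many
distinct pages hold the entries of one cache.

```java
import java.util.HashSet;
import java.util.Set;

public class MixedPagesSketch {
    /** Hypothetical page capacity in entries (an assumption, not an Ignite constant). */
    static final int ENTRIES_PER_PAGE = 4;

    /**
     * Counts the distinct pages holding entries of cache 0 when entries of
     * {@code caches} caches are appended round-robin into shared pages.
     */
    static int pagesTouched(int caches, int entriesPerCache) {
        Set<Integer> pages = new HashSet<>();
        int total = caches * entriesPerCache;

        for (int pos = 0; pos < total; pos++) {
            // In the assumed round-robin layout, every caches-th slot belongs to cache 0.
            if (pos % caches == 0)
                pages.add(pos / ENTRIES_PER_PAGE);
        }

        return pages.size();
    }

    public static void main(String[] args) {
        // One cache with its own pages: 12 entries fit into 3 pages.
        System.out.println("dedicated pages: " + pagesTouched(1, 12));
        // Three caches sharing pages: the same 12 entries are spread over 9 pages.
        System.out.println("shared by 3 caches: " + pagesTouched(3, 12));
    }
}
```

Under these assumptions, reading the same 12 entries touches 3 pages when
the cache has its own pages and 9 pages when three caches share them.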

Currently all scans go through the primary index.

On Wed, Oct 4, 2017 at 12:24 AM, Denis Magda <dmagda@apache.org> wrote:

> Vladimir,
>
> Thanks for the explanation and see inline
>
> > On Oct 3, 2017, at 12:57 PM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
> >
> > Denis,
> >
> > This is not a "must have", nor can I call it a "feature". We have
> > internal partition state metadata. When there are a lot of caches, there
> > is a lot of metadata. It consumes local Java heap, causes high network
> > traffic on rebalance, and requires Ignite to create a lot of files when
> > persistence is enabled, which slows down checkpoints. All these problems
> > could be resolved by a better storage architecture and by "joining" the
> > partition maps of caches with the same affinity functions at runtime.
> >
> > But this is difficult, so we created "cache groups" as a kind of
> > shortcut. It saves heap, saves network, and reduces the number of files.
> > But it comes at a cost: a single data page now contains data from
> > different caches. This causes a higher than usual miss rate (and as a
> > result more OS calls) for random cache operations and index lookups.
>
> By the "higher miss rate", do you mean a longer traversal of the B+tree?
> Has anybody measured the impact? Personally, for me log(n1) is not that
> different from log(n1 + n2 + n3) unless n is a big coefficient.
>
>
> > In future it will also cause
> > poor compression rates when compression is implemented, and it will cause
> > poor scan performance when efficient scans are implemented.
> >
>
> How do we scan grouped caches presently? Simply filtering out the entries
> not belonging to a cache of interest?
>
> > To summarize, we *SHOULD NOT* advise users to use this feature unless
> > they have problems with high heap usage due to partition maps, or poor
> > checkpointing performance due to excessive fsyncs.
> >
>
> Ivan R., Alex G., could you comment on the checkpointing performance? I
> don’t get why the number of open files affects it. What should matter is
> the frequency of fsync, shouldn’t it? If we have fewer files then the
> frequency will soar since every cache writes into a single destination.
>
> Vladimir, what about the long joining process and rebalancing kick-off on
> node failure? I heard that the number of partition maps influences this and
> put it on paper.
>
> —
> Denis
>
> > On Tue, Oct 3, 2017 at 10:48 PM, Denis Magda <dmagda@apache.org> wrote:
> >
> >> Vladimir,
> >>
> >> Please share more details that I can put on the paper. Presently the
> >> feature is described as a must have and I struggled finding any negative
> >> impact related info.
> >>
> >> —
> >> Denis
> >>
> >>> On Oct 3, 2017, at 12:46 PM, Vladimir Ozerov <vozerov@gridgain.com>
> >> wrote:
> >>>
> >>> Denis,
> >>>
> >>> This feature should not be enabled by default as it negatively affects
> >>> read performance.
> >>>
> >>> On Tue, Oct 3, 2017 at 10:31 PM, Denis Magda <dmagda@apache.org>
> wrote:
> >>>
> >>>> Sam,
> >>>>
> >>>> Is there any technical limitation that prevents us from assigning
> >>>> caches with similar parameters to relevant groups on-the-fly?
> >>>>
> >>>> After finishing the doc, I’m convinced the feature should be enabled
> >>>> by default unless there are some pitfalls not known by me.
> >>>>
> >>>> BTW, I decided to avoid the term "logical caches" and fall back to the
> >>>> more vivid notion of cache groups:
> >>>> https://apacheignite.readme.io/docs/cache-groups
> >>>>
> >>>> —
> >>>> Denis
> >>>>
> >>>>> On Oct 3, 2017, at 12:10 AM, Semyon Boikov <sboikov@gridgain.com>
> >> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Regarding the question about the default cache group: by default cache
> >>>>> groups are not enabled, and each cache is started in a separate group.
> >>>>> A cache group is enabled only if groupName is set in
> >>>>> CacheConfiguration.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> On Sat, Sep 30, 2017 at 11:55 PM, <dsetrakyan@apache.org> wrote:
> >>>>>
> >>>>>> Why not? Obviously compression would have to be enabled per group,
> >>>>>> not per cache.
> >>>>>>
> >>>>>> D.
> >>>>>>
> >>>>>> On Sep 29, 2017, at 10:50 PM, Vladimir Ozerov <
> >>>>>> vozerov@gridgain.com> wrote:
> >>>>>>> And it will continue hitting us in the future. For example, when data
> >>>>>>> compression is implemented, the compression rate for logical caches
> >>>>>>> will be poor, as it would be impossible to build efficient
> >>>>>>> dictionaries in mixed data pages.
> >>>>>>>
> >>>>>>> On Sat, Sep 30, 2017 at 8:48 AM, Vladimir Ozerov <vozerov@gridgain.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Folks,
> >>>>>>>>
> >>>>>>>> Honestly, to me logical caches appear to be a dirty shortcut to
> >>>>>>>> mitigate some inefficient internal implementation. Why can't we merge
> >>>>>>>> partition maps at runtime? This should not be a problem for
> >>>>>>>> context-independent affinity functions (e.g.
> >>>>>>>> RendezvousAffinityFunction). From the user's perspective the logical
> >>>>>>>> caches feature is:
> >>>>>>>> 1) Bad API. One cannot define group configuration. All you can do is
> >>>>>>>> define the group name at the cache level and hope that nobody has
> >>>>>>>> started another cache in the same group with a different
> >>>>>>>> configuration before.
> >>>>>>>> 2) Performance impact for scans, as you have to iterate over mixed
> >>>>>>>> data.
> >>>>>>>>
> >>>>>>>> Couldn't we fix the partition map problem without cache groups?
> >>>>>>>>
> >>>>>>>> On Sat, Sep 30, 2017 at 2:35 AM, Denis Magda <dmagda@apache.org>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Guys,
> >>>>>>>>>
> >>>>>>>>> Another question. Is this capability enabled by default? If yes,
> >>>>>>>>> how do we decide which group a cache goes to?
> >>>>>>>>>
> >>>>>>>>> —
> >>>>>>>>> Denis
> >>>>>>>>>
> >>>>>>>>>> On Sep 29, 2017, at 3:58 PM, Denis Magda <dmagda@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Igniters,
> >>>>>>>>>>
> >>>>>>>>>> I’ve put on paper the feature from the subject:
> >>>>>>>>>> https://apacheignite.readme.io/docs/logical-caches
> >>>>>>>>>>
> >>>>>>>>>> Sam, I will appreciate it if you read through it and confirm that I
> >>>>>>>>>> explained the topic 100% technically correctly.
> >>>>>>>>>>
> >>>>>>>>>> However, are there any negative impacts of having logical caches?
> >>>>>>>>>> This page has the “Possible Impacts” section unfilled:
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/IGNITE/Logical+Caches
> >>>>>>>>>>
> >>>>>>>>>> —
> >>>>>>>>>> Denis
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
