ignite-dev mailing list archives

From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Prohibit stateful affinity (FairAffinityFunction)
Date Tue, 11 Apr 2017 16:51:58 GMT
Yakov,

I would like to understand the percentage of redundant partition moves in
the new implementation, whenever the topology changes. For example, if you
have Node1 and Node2, and then Node3 is added, then the only *good*
partition traffic is from Node1 -> Node3 and from Node2 -> Node3. However,
the traffic between Node1 <-> Node2 is generally redundant and useless. Do
we know the percentage of the redundant partition migration?

Without getting a clear picture here, we should not be making a decision
one way or another.
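
As a rough way to frame this question, here is a minimal Python sketch that counts "good" and redundant primary moves when Node3 joins Node1 and Node2. It uses plain rendezvous (highest-random-weight) hashing over MD5 as a stand-in; it is not Ignite's RendezvousAffinityFunction, it has no balancer, and the helper names (`weight`, `assign`, `migration_stats`) and the partition count are assumptions made for illustration.

```python
import hashlib

def weight(node: str, part: int) -> int:
    """Deterministic pseudo-random weight for a (node, partition) pair."""
    digest = hashlib.md5(f"{node}:{part}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def assign(nodes, parts):
    """Rendezvous assignment of primaries: partition -> node with the top weight."""
    return {p: max(nodes, key=lambda n: weight(n, p)) for p in range(parts)}

def migration_stats(before, after, joined):
    """Return (moved, redundant): redundant moves do not end on a joined node."""
    moved = [p for p in before if before[p] != after[p]]
    redundant = [p for p in moved if after[p] not in joined]
    return len(moved), len(redundant)

PARTS = 1024  # hypothetical partition count
old = assign(["Node1", "Node2"], PARTS)
new = assign(["Node1", "Node2", "Node3"], PARTS)
moved, redundant = migration_stats(old, new, joined={"Node3"})
share = 100.0 * redundant / max(moved, 1)
print(f"moved={moved} redundant={redundant} ({share:.1f}% of moves)")
```

With pure rendezvous hashing a partition changes owner only when the new node wins it, so the redundant count in this sketch is zero. Any balancing step layered on top can break that property, and the same counting would then show by how much.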

D.

On Tue, Apr 11, 2017 at 1:56 AM, Yakov Zhdanov <yzhdanov@apache.org> wrote:

> Guys, after some thought I would say that even distribution is more
> important for an affinity function than traffic on rebalancing (which
> should also be kept to a minimum). Even distribution gives even load on a
> stable topology, while rebalancing is something of a disaster case. Grids
> should spend more time in a stable state than in failure recovery, and
> rebalancing can be configured to have as little impact on the system as
> possible.
>
> Dmitry, M. Griggs fixed keys distribution over partitions, but not
> partitions over nodes. This change is in ignite-4828, I reviewed it and
> will merge it today.
>
> Taras, your numbers also look very suspicious: do you really have 26
> partitions migrated on a 64-node topology when 1 node leaves? I will review
> your changes one more time and provide comments here.
>
> --Yakov
>
> 2017-04-10 18:12 GMT+03:00 Taras Ledkov <tledkov@gridgain.com>:
>
> > I updated the issue [1] with the table of the average count of migrated
> > primary partitions when one node leaves.
> >
> > [1]. https://issues.apache.org/jira/browse/IGNITE-3018?focusedCommentId=15963015&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15963015
> >
> >
> >
> > On 10.04.2017 18:00, Sergi Vladykin wrote:
> >
> >> Absolutely agree, let's get some numbers on RendezvousAffinity with both
> >> variants: useBalancer enabled and disabled. Taras, can you provide them?
> >>
> >> Anyway, at the moment we need to make a decision on what will get into
> >> 2.0. I'm for dropping (or hiding) all the suspicious stuff and adding it
> >> back if we fix it. Thus I'm going to move FairAffinity into a private
> >> package now.
> >>
> >> Sergi
> >>
> >> 2017-04-10 16:55 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com>:
> >>
> >>> Sergi,
> >>>
> >>> AFAIK the only reason why RendezvousAffinity is used by default is that
> >>> behavior on rebalance is no less important than steady-state
> >>> performance, especially on large deployments and in cloud environments,
> >>> where nodes constantly join and leave the topology. Let's stop guessing
> >>> and discuss the numbers: how many partition reassignments happen with
> >>> the new RendezvousAffinity flavor? I haven't seen any results so far.
> >>>
> >>> On Mon, Apr 10, 2017 at 4:39 PM, Andrey Gura <agura@apache.org> wrote:
> >>>
> >>>> Guys,
> >>>>
> >>>> It seems that both of the mentioned problems have the same root cause:
> >>>> each cache has its own affinity function instance. This leads to a
> >>>> performance problem (we repeat the same calculations for each cache)
> >>>> and to a problem related to the fact that FairAffinityFunction is
> >>>> stateful (a co-located cache gets a different assignment if it was
> >>>> started on a different topology).
> >>>>
> >>>> The obvious solution is to use the same affinity for a set of caches.
> >>>> As a result, all caches from one set will use the same assignment,
> >>>> which will be calculated exactly once and will not depend on the
> >>>> topology at cache start.
> >>>>
> >>>> On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
> >>>> <sergi.vladykin@gmail.com> wrote:
> >>>>
> >>>>> As for the default value of the useBalancer flag, I agree with Yakov:
> >>>>> it must be enabled by default, because performance in steady state is
> >>>>> usually more important than performance of rebalancing. For edge
> >>>>> cases it can be disabled.
> >>>>>
> >>>>> Sergi
> >>>>>
> >>>>> 2017-04-10 15:04 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
> >>>>>
> >>>>>> If RendezvousAffinity with useBalancer enabled is not much worse than
> >>>>>> FairAffinity, I see no reason to keep the latter.
> >>>>>>
> >>>>>> Sergi
> >>>>>>
> >>>>>> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com>:
> >>>>>>
> >>>>>>> Guys,
> >>>>>>>
> >>>>>>> We should not have it enabled by default because, as Taras mentioned:
> >>>>>>> "but in this case there is not guarantee that a partition doesn't
> >>>>>>> move from one node to another when node leave topology". Let's avoid
> >>>>>>> any rush here. There is nothing terribly wrong with FairAffinity. It
> >>>>>>> is not enabled by default, and at the very least we can always mark
> >>>>>>> it as deprecated. It is better to first test rendezvous affinity
> >>>>>>> rigorously in terms of partition distribution and partition
> >>>>>>> migration, and then decide whether the results are acceptable.
> >>>>>>>
> >>>>>>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov <yzhdanov@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> We should have it enabled by default.
> >>>>>>>>
> >>>>>>>> --Yakov
> >>>>>>>>
> >>>>>>>> 2017-04-10 12:42 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
> >>>>>>>>
> >>>>>>>>> Why wouldn't we have useBalancer always enabled?
> >>>>>>>>>
> >>>>>>>>> Sergi
> >>>>>>>>>
> >>>>>>>>> 2017-04-10 12:31 GMT+03:00 Taras Ledkov <tledkov@gridgain.com>:
> >>>>>>>>>
> >>>>>>>>>> Folks,
> >>>>>>>>>>
> >>>>>>>>>> I worked on issue https://issues.apache.org/jira/browse/IGNITE-3018,
> >>>>>>>>>> which is related to the performance of Rendezvous AF.
> >>>>>>>>>>
> >>>>>>>>>> But the distribution of the Wang/Jenkins integer hash is worse
> >>>>>>>>>> than that of MD5. So I tried to use a simple partition balancer,
> >>>>>>>>>> close to Fair AF, for Rendezvous AF.
> >>>>>>>>>>
> >>>>>>>>>> Take a look at the heatmaps of distributions in the issue, e.g.:
> >>>>>>>>>> - Comparison of the current Rendezvous AF and the new Rendezvous AF
> >>>>>>>>>>   based on the Wang/Jenkins hash:
> >>>>>>>>>>   https://issues.apache.org/jira/secure/attachment/12858701/004.png
> >>>>>>>>>> - Comparison of the current Rendezvous AF and the new Rendezvous AF
> >>>>>>>>>>   based on the Wang/Jenkins hash with the partition balancer:
> >>>>>>>>>>   https://issues.apache.org/jira/secure/attachment/12858690/balanced.004.png
> >>>>>>>>>>
> >>>>>>>>>> When the balancer is enabled, the distribution of partitions across
> >>>>>>>>>> nodes looks close to even, but in this case there is no guarantee
> >>>>>>>>>> that a partition doesn't move from one node to another when a node
> >>>>>>>>>> leaves the topology. It is not guaranteed, but we try to minimize
> >>>>>>>>>> it because a sorted array of nodes is used (as in pure Rendezvous
> >>>>>>>>>> AF).
> >>>>>>>>>>
> >>>>>>>>>> I think we can use the new fast Rendezvous AF with the
> >>>>>>>>>> 'useBalancer' flag instead of Fair AF.
> >>>>>>>>>>
> >>>>>>>>>> On 09.04.2017 14:12, Valentin Kulichenko wrote:
> >>>>>>>>>>
> >>>>>>>>>>> What is the replacement for FairAffinityFunction?
> >>>>>>>>>>>
> >>>>>>>>>>> Generally I agree. If FairAffinityFunction can't be changed to
> >>>>>>>>>>> provide consistent mapping, it should be dropped.
> >>>>>>>>>>>
> >>>>>>>>>>> -Val
> >>>>>>>>>>>
> >>>>>>>>>>> On Sun, Apr 9, 2017 at 3:50 AM, Sergi Vladykin
> >>>>>>>>>>> <sergi.vladykin@gmail.com <mailto:sergi.vladykin@gmail.com>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>      Guys,
> >>>>>>>>>>>
> >>>>>>>>>>>      It appeared that our FairAffinityFunction can assign the same
> >>>>>>>>>>>      partitions to different nodes for different caches.
> >>>>>>>>>>>
> >>>>>>>>>>>      It basically means that there is no collocation between the
> >>>>>>>>>>>      caches at all, even if they have the same affinity.
> >>>>>>>>>>>
> >>>>>>>>>>>      As a result, all SQL joins will not work (even collocated
> >>>>>>>>>>>      ones), and other operations that rely on cache collocation
> >>>>>>>>>>>      will be either broken or slower than expected.
> >>>>>>>>>>>
> >>>>>>>>>>>      All this stuff is really non-obvious, and I see no reason why
> >>>>>>>>>>>      we should allow that. I suggest prohibiting this behavior and
> >>>>>>>>>>>      dropping FairAffinityFunction before 2.0. We have to clearly
> >>>>>>>>>>>      document that the same affinity function must provide the
> >>>>>>>>>>>      same partition assignments for all the caches.
> >>>>>>>>>>>
> >>>>>>>>>>>      Also, I know that Taras Ledkov was working on a decent
> >>>>>>>>>>>      stateless replacement for FairAffinity, so we should not lose
> >>>>>>>>>>>      anything here.
> >>>>>>>>>>>
> >>>>>>>>>>>      Thoughts?
> >>>>>>>>>>>
> >>>>>>>>>>>      Sergi
> >>>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Taras Ledkov
> >>>>>>>>>> Mail-To: tledkov@gridgain.com
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>
> > --
> > Taras Ledkov
> > Mail-To: tledkov@gridgain.com
> >
> >
>
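
The collocation problem reported at the start of the thread can be illustrated with a small Python sketch. `ToyStatefulAffinity` is a hypothetical, deliberately simplified function that remembers its previous assignment and then evens out the load; it is not Ignite's FairAffinityFunction, and the node names and partition count are assumptions. The stateless `rendezvous` function is shown for contrast.

```python
import hashlib

def rendezvous(nodes, parts):
    """Stateless: the result depends only on the current set of nodes."""
    def weight(node, part):
        return hashlib.md5(f"{node}:{part}".encode()).digest()
    return {p: max(nodes, key=lambda n: weight(n, p)) for p in range(parts)}

class ToyStatefulAffinity:
    """Hypothetical stateful function: keeps earlier owners, then evens out load."""

    def __init__(self, parts):
        self.parts = parts
        self.prev = {}  # assignment remembered from the previous topology

    def assign(self, nodes):
        load = {n: [] for n in nodes}
        orphans = []
        for p in range(self.parts):
            owner = self.prev.get(p)
            if owner in load:
                load[owner].append(p)  # owner is still alive: keep the partition
            else:
                orphans.append(p)
        for p in orphans:  # unowned partitions go to the least loaded node
            min(load.values(), key=len).append(p)
        while True:  # shift partitions until loads differ by at most one
            big = max(load.values(), key=len)
            small = min(load.values(), key=len)
            if len(big) - len(small) <= 1:
                break
            small.append(big.pop())
        self.prev = {p: n for n, ps in load.items() for p in ps}
        return self.prev

early = ToyStatefulAffinity(12)
early.assign(["N1", "N2"])                  # cache A starts on two nodes
cache_a = early.assign(["N1", "N2", "N3"])  # then N3 joins
cache_b = ToyStatefulAffinity(12).assign(["N1", "N2", "N3"])  # cache B starts later

mismatched = [p for p in range(12) if cache_a[p] != cache_b[p]]
print("partitions with different owners:", mismatched)
print("stateless assignments agree:",
      rendezvous(["N3", "N1", "N2"], 12) == rendezvous(["N1", "N2", "N3"], 12))
```

Both toy caches end up evenly balanced, yet several partitions have different owners because the two instances saw different topology histories. The stateless function gives the same mapping no matter when a cache was started or in what order the nodes are listed.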
