ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: Resurrect FairAffinityFunction
Date Mon, 28 Aug 2017 12:26:18 GMT
Igniters,

I created a ticket for throwing an SQL exception when an invalid combination of
affinity functions is used in a query with JOINs:
https://issues.apache.org/jira/browse/IGNITE-6195
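
For readers of the archive, here is a minimal sketch of the kind of check
discussed in the quoted thread (point 3 of the proposed fix). It is not the
implementation from the ticket: JoinAffinityCheck, AffinityDesc, problem() and
validate() are hypothetical names invented for illustration, and none of them
are Ignite APIs. It only assumes that each affinity function has a sensible
equals/hashCode and a "deterministic" flag, as proposed below.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch; none of these types are Ignite APIs. */
final class JoinAffinityCheck {
    /** Stand-in for an affinity function's configuration. */
    static final class AffinityDesc {
        final String type;            // e.g. "rendezvous" or "fair"
        final int partitions;
        final boolean deterministic;  // false if assignments depend on topology history

        AffinityDesc(String type, int partitions, boolean deterministic) {
            this.type = type;
            this.partitions = partitions;
            this.deterministic = deterministic;
        }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof AffinityDesc)) return false;
            AffinityDesc that = (AffinityDesc) o;
            return partitions == that.partitions
                && deterministic == that.deterministic
                && type.equals(that.type);
        }

        @Override public int hashCode() {
            return Objects.hash(type, partitions, deterministic);
        }
    }

    /** Returns null if a collocated JOIN is safe, otherwise the reason it is not. */
    static String problem(List<AffinityDesc> partitionedCaches, boolean distributedJoins) {
        // Distributed joins do not rely on collocation; a single cache has nothing to match.
        if (distributedJoins || partitionedCaches.size() < 2)
            return null;

        AffinityDesc first = partitionedCaches.get(0);

        for (AffinityDesc aff : partitionedCaches) {
            if (!aff.deterministic)
                return "non-deterministic affinity function: " + aff.type;
            if (!aff.equals(first))
                return "different affinity functions: " + first.type + " vs " + aff.type;
        }
        return null;
    }

    /** Rejects the query with an exception when the check fails. */
    static void validate(List<AffinityDesc> partitionedCaches, boolean distributedJoins) {
        String reason = problem(partitionedCaches, distributedJoins);
        if (reason != null)
            throw new IllegalStateException("JOIN rejected, " + reason);
    }

    public static void main(String[] args) {
        AffinityDesc rdv = new AffinityDesc("rendezvous", 1024, true);
        AffinityDesc fair = new AffinityDesc("fair", 1024, false);

        validate(Arrays.asList(rdv, rdv), false);  // same deterministic function: accepted
        validate(Arrays.asList(rdv, fair), true);  // distributed joins enabled: accepted
        try {
            validate(Arrays.asList(fair, fair), false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that in this sketch two caches using the same history-dependent function
are still rejected, because equal configuration does not imply equal partition
assignments when the function is not deterministic.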

On Wed, Aug 16, 2017 at 11:59 AM, Vladimir Ozerov <vozerov@gridgain.com>
wrote:

> I still do not quite understand why we bother so much about partition
> migration when we already have RendezvousAffinityFunction. It looks like we
> tried to make "fair" closer to "rendezvous", and then tried to make
> "rendezvous" closer to "fair", but without any success:
> - we are still not satisfied with the distribution of "rendezvous", so we
> want to bring "fair" back
> - and "fair" cannot co-locate caches properly, so things like SQL and
> "affinityCall" are broken for it
>
> Can we add a flag to FairAffinityFunction to control whether it should use
> previous history or not? If yes - it is not deterministic, if no - it is
> deterministic.
>
>
> On Wed, Aug 16, 2017 at 11:29 AM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
>> I already explained why checking specifically for FairAffinityFunction in
>> SQL is wrong.
>>
>> On Wed, Aug 16, 2017 at 10:00 AM, Dmitriy Setrakyan <
>> dsetrakyan@apache.org> wrote:
>>
>>> Vladimir, I think you should look more carefully at what FairAffinity does
>>> and why it exists. There is no way to "fix" it not to accept previous state
>>> - it must be provided.
>>>
>>> I do not understand why we need to change anything in code when there is a
>>> simple solution being proposed here. Why not just add a validation to SQL
>>> when fair affinity is used with JOINs and be done with it?
>>>
>>> Alexey G, given that you originally wrote the FairAffinity function, can
>>> you please comment?
>>>
>>> D.
>>>
>>> On Tue, Aug 15, 2017 at 10:41 PM, Vladimir Ozerov <vozerov@gridgain.com>
>>> wrote:
>>>
>>> > Dima,
>>> >
>>> > It is not hard to implement. It is hard to reason about whether your query
>>> > will fail or not. Moreover, cache groups are an antipattern for SQL;
>>> > personally I do not want users to use them unless absolutely needed (large
>>> > topologies, large number of caches). Also take into account that the same
>>> > problem with different partition distribution holds for any two caches
>>> > with different affinity functions, so the problem is not tied to
>>> > FairAffinityFunction only.
>>> >
>>> > IMO the correct fix should be as follows:
>>> > 1) Add a requirement that every affinity function must provide a sensible
>>> > implementation of equals/hashCode
>>> > 2) Add a "boolean deterministic()" method to the affinity function
>>> > interface; "true" means that the function doesn't depend on any external
>>> > things, such as topology history
>>> > 3) Fail SQL if there are at least two PARTITIONED caches with different or
>>> > non-deterministic affinity functions, and distributed joins are not enabled
>>> > 4) Fix FairAffinityFunction and make it deterministic; it should not depend
>>> > on previous state.
>>> >
>>> > Makes sense?
>>> >
>>> > On Wed, Aug 16, 2017 at 3:31 AM, Dmitriy Setrakyan <dsetrakyan@apache.org>
>>> > wrote:
>>> >
>>> > > On Tue, Aug 15, 2017 at 1:12 PM, Vladimir Ozerov <vozerov@gridgain.com>
>>> > > wrote:
>>> > >
>>> > > > I do not like the idea as it would make it very hard to reason about
>>> > > > whether your SQL will fail or not. Let's look at the problem from a
>>> > > > different angle. I have had this question for years - why in the world
>>> > > > does the *fair* affinity function, whose only ultimate goal is to
>>> > > > provide equal partition distribution, depend on its own previous state?
>>> > > > Can we re-design it in a way that it depends only on partition count and
>>> > > > current topology state?
>>> > > >
>>> > >
>>> > > Vladimir, we must know previous state, otherwise the data partitions will
>>> > > be randomly moving across the network every time a topology changes.
>>> > >
>>> > > From the SQL standpoint, you can just fail all queries that have a JOIN
>>> > > from different cache groups, if at least one of the groups is using Fair
>>> > > Affinity. I am not sure why this would be hard.
>>> > >
>>> > >
>>> > > >
>>> > > > On Thu, Aug 10, 2017 at 12:16 AM, Valentin Kulichenko <
>>> > > > valentin.kulichenko@gmail.com> wrote:
>>> > > >
>>> > > > > As far as I know, all logical caches with the same affinity function
>>> > > > > and node filter will end up in the same group. If that's the case, I
>>> > > > > like the idea. This is exactly what I was looking for.
>>> > > > >
>>> > > > > -Val
>>> > > > >
>>> > > > > On Wed, Aug 9, 2017 at 8:18 AM, Evgenii Zhuravlev <
>>> > > > > e.zhuravlev.wk@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > Dmitriy,
>>> > > > > >
>>> > > > > > Yes, you're right. Moreover, it looks like a good practice to
>>> > > > > > combine caches that will be used for collocated JOINs in one group
>>> > > > > > since it reduces overall overhead.
>>> > > > > >
>>> > > > > > I think it's not a problem to add this restriction at the SQL JOIN
>>> > > > > > level if we decide to use this solution.
>>> > > > > >
>>> > > > > > Evgenii
>>> > > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > > 2017-08-09 17:07 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>>> > > > > >
>>> > > > > > > On Wed, Aug 9, 2017 at 6:28 AM, ezhuravl <e.zhuravlev.wk@gmail.com>
>>> > > > > > > wrote:
>>> > > > > > >
>>> > > > > > > > Folks,
>>> > > > > > > >
>>> > > > > > > > I've started working on the
>>> > > > > > > > https://issues.apache.org/jira/browse/IGNITE-5836
>>> > > > > > > > ticket and found that the recently added cacheGroups feature
>>> > > > > > > > does pretty much the same thing that was described in this
>>> > > > > > > > issue. CacheGroup guarantees that all caches within a group
>>> > > > > > > > have the same assignments since they share a single underlying
>>> > > > > > > > 'physical' cache.
>>> > > > > > > >
>>> > > > > > > > I think we can return FairAffinityFunction and add information
>>> > > > > > > > to its Javadoc that all caches with the same AffinityFunction
>>> > > > > > > > and NodeFilter should be combined in a cache group to avoid a
>>> > > > > > > > problem with inconsistent previous assignments.
>>> > > > > > > >
>>> > > > > > > > What do you guys think?
>>> > > > > > > >
>>> > > > > > >
>>> > > > > > > Are you suggesting that we can only reuse the same
>>> > > > > > > FairAffinityFunction across the logical caches within the same
>>> > > > > > > group? This would mean that caches from the different groups
>>> > > > > > > cannot participate in JOINs or collocated compute.
>>> > > > > > >
>>> > > > > > > I think I like the idea, however, we need to make sure that we
>>> > > > > > > enforce this restriction, at least at the SQL JOIN level.
>>> > > > > > >
>>> > > > > > > Alexey G, Val, would be nice to hear your thoughts on this.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > >
>>> > > > > > > > Evgenii
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > --
>>> > > > > > > > View this message in context:
>>> > > > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Resurrect-FairAffinityFunction-tp19987p20669.html
>>> > > > > > > > Sent from the Apache Ignite Developers mailing list archive at
>>> > > > > > > > Nabble.com.
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>
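
The disagreement in this thread turns on whether partition assignment depends
only on partition count and current topology, or also on previous state. As an
illustration of the first kind, here is a minimal sketch of rendezvous-style
(highest-weight) assignment. It is not Ignite's RendezvousAffinityFunction:
the class HistoryFreeAssignment, its methods and the hash mixing are all
assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of rendezvous-style assignment; not Ignite's implementation. */
final class HistoryFreeAssignment {
    /** Mixes partition and node id into a weight; any stable hash would do here. */
    static long weight(int partition, String nodeId) {
        long h = nodeId.hashCode() * 31L + partition;
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }

    /** Primary owner per partition: the node with the highest weight, ties broken by node id. */
    static List<String> assign(int partitions, List<String> nodes) {
        List<String> owners = new ArrayList<>(partitions);

        for (int p = 0; p < partitions; p++) {
            String best = null;
            long bestWeight = 0;

            for (String node : nodes) {
                long w = weight(p, node);

                if (best == null || w > bestWeight || (w == bestWeight && node.compareTo(best) < 0)) {
                    best = node;
                    bestWeight = w;
                }
            }
            owners.add(best);
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> before = assign(16, Arrays.asList("n1", "n2", "n3"));
        List<String> after = assign(16, Arrays.asList("n1", "n2"));

        // Only partitions that were owned by the departed node "n3" change owner.
        for (int p = 0; p < before.size(); p++)
            System.out.println(p + ": " + before.get(p) + " -> " + after.get(p));
    }
}
```

Because the owner of each partition is computed from the partition number and
the current node set alone, two caches configured the same way always get the
same assignment, and a partition moves only when its owner leaves or a newly
joined node outweighs it. What this approach does not guarantee is an even
number of partitions per node, which is the trade-off the thread is about.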
