ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexei Scherbakov <alexey.scherbak...@gmail.com>
Subject Re: Changing public IgniteCompute API to improve changes in 5037 ticket
Date Sat, 29 Jul 2017 12:04:25 GMT
Anton,

I'm fine with proposed solution using AffinityComputeTask.

Please take care of issues pointed by me while implementing.

Each job should somehow know the partition it's running over, on example,
for setting it for ScanQuery or SqlQuery over local partition.

affinityExecute must support nullable set of partitions for splitting over
all available partitions.

Val, any task requiring map/reduce API and data collocation is eligible.

On example, in banking, it would be task sorting all depositors by amount
of money on deposit or some other complex criteria, or calculation of
income for each deposit.

My client(bank) has plenty of such tasks.

2017-07-26 20:57 GMT+03:00 Valentin Kulichenko <
valentin.kulichenko@gmail.com>:

> Anton,
>
> This seems to be a completely separate issue. I don't see how it can be
> fixed by adding new APIs.
>
> -Val
>
> On Wed, Jul 26, 2017 at 3:56 AM, Anton Vinogradov <
> avinogradov@gridgain.com>
> wrote:
>
> > Val,
> >
> > AFAIK, affinityRun/Call has guarantee to be successfully executed on
> > unstable topology in case partition was not losed, only relocated to
> > another node during rebalancing.
> >
> > On Tue, Jul 25, 2017 at 10:44 PM, Valentin Kulichenko <
> > valentin.kulichenko@gmail.com> wrote:
> >
> > > Alexey,
> > >
> > > Is there exact use case that is currently not supported? I really would
> > > like to see one, because such a big API change should add clear value.
> > > ComputeGrid is not used very often, and so far I've never seen any
> > > questions from users about using it in conjunction with affinity
> > > collocation.
> > >
> > > What if we solve this on job level instead by adding the following
> > > interface:
> > >
> > > interface AffinityComputeJob extends ComputeJob {
> > >     String cacheName();
> > >     Object affinityKey();
> > > }
> > >
> > > Whenever load balancer sees this job, it maps it based on affinity.
> Will
> > > this work?
> > >
> > > -Val
> > >
> > > On Tue, Jul 25, 2017 at 12:37 PM, Valentin Kulichenko <
> > > valentin.kulichenko@gmail.com> wrote:
> > >
> > > > Anton,
> > > >
> > > > How does topology change break this functionality? Closures executed
> > with
> > > > affinityRun/Call fail over in the same way as any ComputeJob.
> > > >
> > > > -Val
> > > >
> > > > On Tue, Jul 25, 2017 at 5:48 AM, Anton Vinogradov <
> > > > avinogradov@gridgain.com> wrote:
> > > >
> > > >> Alexei,
> > > >>
> > > >> > How would task know the partition it is running over ?
> > > >> Not sure it necessary.
> > > >> You'll create pair partition-job at task's map phase.
> > > >>
> > > >> > How can I assign task for each cache partition ?
> > > >> Just implement map method generates map with size equals to
> partition
> > > >> count.
> > > >>
> > > >> > How can I enforce partition reservation if task works with
> multiple
> > > >> caches at once ?
> > > >> This possible only in case caches use safe affinity function.
> > > >> And it useful only it this case.
> > > >>
> > > >> On Tue, Jul 25, 2017 at 3:22 PM, Alexei Scherbakov <
> > > >> alexey.scherbakoff@gmail.com> wrote:
> > > >>
> > > >> > Please read job instead task
> > > >> >
> > > >> > 2017-07-25 15:20 GMT+03:00 Alexei Scherbakov <
> > > >> alexey.scherbakoff@gmail.com
> > > >> > >:
> > > >> >
> > > >> > > Main point of the issue is to provide clean API for working
with
> > > >> > > computations requiring data collocation
> > > >> > >
> > > >> > > affinityCall/Run provide the ability to run closure near
data,
> but
> > > >> > > map/reduce API is a way reacher: continuous mapping, task
> session,
> > > >> etc.
> > > >> > >
> > > >> > > As for proposed API, I do not understand fully how it solves
the
> > > >> problem.
> > > >> > >
> > > >> > > Maxim, please provide detailed javadoc for each method and
each
> > > >> argument
> > > >> > > for presented API, and the answers to the following questions:
> > > >> > >
> > > >> > > 1. How would task know the partition it is running over
?
> > > >> > >
> > > >> > > 2. How can I assign task for each cache partition ?
> > > >> > >
> > > >> > > 3. How can I enforce partition reservation if task works
with
> > > multiple
> > > >> > > caches at once ?
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > 2017-07-25 12:30 GMT+03:00 Anton Vinogradov <
> > > avinogradov@gridgain.com
> > > >> >:
> > > >> > >
> > > >> > >> Val,
> > > >> > >>
> > > >> > >> Sure, we can, but we'd like to use map/reduce without
fearing
> > that
> > > >> > >> topology
> > > >> > >> can change.
> > > >> > >>
> > > >> > >> On Mon, Jul 24, 2017 at 11:17 PM, Valentin Kulichenko
<
> > > >> > >> valentin.kulichenko@gmail.com> wrote:
> > > >> > >>
> > > >> > >> > Anton,
> > > >> > >> >
> > > >> > >> > You can call affinityCallAsync multiple times and
then reduce
> > > >> locally.
> > > >> > >> >
> > > >> > >> > -Val
> > > >> > >> >
> > > >> > >> > On Mon, Jul 24, 2017 at 3:05 AM, Anton Vinogradov
<
> > > >> > >> > avinogradov@gridgain.com>
> > > >> > >> > wrote:
> > > >> > >> >
> > > >> > >> > > Val,
> > > >> > >> > >
> > > >> > >> > > > What is the use case for which current
affinityRun/Call
> API
> > > >> > doesn't
> > > >> > >> > work?
> > > >> > >> > > It does not work for map/reduce.
> > > >> > >> > >
> > > >> > >> > > On Fri, Jul 21, 2017 at 11:42 PM, Valentin
Kulichenko <
> > > >> > >> > > valentin.kulichenko@gmail.com> wrote:
> > > >> > >> > >
> > > >> > >> > > > Maxim,
> > > >> > >> > > >
> > > >> > >> > > > The issue is that it's currently assumed
to support job
> > > >> mapping,
> > > >> > >> but it
> > > >> > >> > > > actually doesn't. However, I agree that
AffinityKeyMapped
> > > >> > annotation
> > > >> > >> > > > doesn't fit the use case well. Let's
fix documentation
> and
> > > >> JavaDoc
> > > >> > >> > then.
> > > >> > >> > > >
> > > >> > >> > > > As for the proposed API, it's overcomplicated,
took me 15
> > > >> minutes
> > > >> > to
> > > >> > >> > > > understand what it does :)
> > > >> > >> > > >
> > > >> > >> > > > What is the use case for which current
affinityRun/Call
> API
> > > >> > doesn't
> > > >> > >> > work?
> > > >> > >> > > >
> > > >> > >> > > > -Val
> > > >> > >> > > >
> > > >> > >> > > > On Fri, Jul 21, 2017 at 5:57 AM, Kozlov
Maxim <
> > > >> > dreamx.max@gmail.com
> > > >> > >> >
> > > >> > >> > > > wrote:
> > > >> > >> > > >
> > > >> > >> > > > > Valentin,
> > > >> > >> > > > >
> > > >> > >> > > > > The author of tiket wants to see
to provide some API
> > allows
> > > >> to
> > > >> > map
> > > >> > >> > > > > ComputeJobs to partitions or keys.
If we use
> > > >> @AffinityKeyMapped
> > > >> > >> then
> > > >> > >> > > you
> > > >> > >> > > > > need to enter the cache name parameter,
I think this is
> > not
> > > >> > >> > convenient
> > > >> > >> > > > for
> > > >> > >> > > > > the user. Therefore, I propose to
extend the existing
> > API.
> > > >> > >> > > > > Having consulted with Anton V. decided
to make a
> separate
> > > >> > >> interface
> > > >> > >> > > > > ReducibleTask, which will allow
us to have different
> map
> > > >> logic
> > > >> > at
> > > >> > >> > each
> > > >> > >> > > > > inheritor.
> > > >> > >> > > > >
> > > >> > >> > > > > Old method, allows to map to node
> > > >> > >> > > > > public interface ComputeTask<T,
R> extends
> > > ReducibleTask<R> {
> > > >> > >> > > > >     @Nullable public Map<? extends
ComputeJob,
> > ClusterNode>
> > > >> > >> > > > > map(List<ClusterNode> subgrid,
@Nullable T arg) throws
> > > >> > >> > IgniteException;
> > > >> > >> > > > > }
> > > >> > >> > > > >
> > > >> > >> > > > > Brand new method with mapping to
partitions, which
> solves
> > > >> > topology
> > > >> > >> > > change
> > > >> > >> > > > > issues.
> > > >> > >> > > > > public interface AffinityComputeTask<T,
R> extends
> > > >> > >> ReducibleTask<R> {
> > > >> > >> > > > >     @Nullable public Map<? extends
ComputeJob, Integer>
> > > >> > >> > > > map(@NotnullString
> > > >> > >> > > > > cacheName, List<Integer> partIds,
@Nullable T arg)
> throws
> > > >> > >> > > > IgniteException;
> > > >> > >> > > > > }
> > > >> > >> > > > >
> > > >> > >> > > > > public interface ReducibleTask<R>
extends Serializable
> {
> > > >> > >> > > > >     public ComputeJobResultPolicy
> result(ComputeJobResult
> > > >> res,
> > > >> > >> > > > > List<ComputeJobResult> rcvd)
throws IgniteException;
> > > >> > >> > > > >
> > > >> > >> > > > >     @Nullable public R reduce(List<ComputeJobResult>
> > > results)
> > > >> > >> throws
> > > >> > >> > > > > IgniteException;
> > > >> > >> > > > > }
> > > >> > >> > > > >
> > > >> > >> > > > > We also need to implement AffinityComputeTaskAdapter
> and
> > > >> > >> > > > > AffinityComputeTaskSplitAdapter,
for implementation by
> > > >> default.
> > > >> > >> It
> > > >> > >> > is
> > > >> > >> > > > > right?
> > > >> > >> > > > >
> > > >> > >> > > > > In the IgniteCompute add:
> > > >> > >> > > > >
> > > >> > >> > > > > @IgniteAsyncSupported
> > > >> > >> > > > > public <T, R> R affinityExecute(Class<?
extends
> > > >> > >> > AffinityComputeTask<T,
> > > >> > >> > > > R>>
> > > >> > >> > > > > taskCls, List<Integer> partIds,
@Nullable T arg) throws
> > > >> > >> > > IgniteException;
> > > >> > >> > > > > @IgniteAsyncSupported
> > > >> > >> > > > > public <T, R> R affinityExecute(AffinityComputeTask<T,
> > R>
> > > >> task,
> > > >> > >> > > > > List<Integer> partIds, @Nullable
T arg) throws
> > > >> IgniteException;
> > > >> > >> > > > >
> > > >> > >> > > > > public <T, R> ComputeTaskFuture<R>
> > > >> affinityExecuteAsync(Class<?
> > > >> > >> > extends
> > > >> > >> > > > > AffinityComputeTask<T, R>>
taskCls, List<Integer>
> > partIds,
> > > >> > >> @Nullable
> > > >> > >> > T
> > > >> > >> > > > arg)
> > > >> > >> > > > > throws IgniteException;
> > > >> > >> > > > > public <T, R> ComputeTaskFuture<R>
> affinityExecuteAsync(
> > > >> > >> > > > AffinityComputeTask<T,
> > > >> > >> > > > > R> task, List<Integer>
partIds, @Nullable T arg) throws
> > > >> > >> > > IgniteException;
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > > How do you like this idea or do
you insist that you
> need
> > to
> > > >> use
> > > >> > >> > > > > @AffinityKeyMapped to solve the
problem?
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > > > 13 июля 2017 г., в 6:36,
Valentin Kulichenko <
> > > >> > >> > > > > valentin.kulichenko@gmail.com>
написал(а):
> > > >> > >> > > > > >
> > > >> > >> > > > > > Hi Max,
> > > >> > >> > > > > >
> > > >> > >> > > > > > This ticket doesn't assume
any API changes, it's
> about
> > > >> broken
> > > >> > >> > > > > > functionality. I would start
with checking what tests
> > we
> > > >> have
> > > >> > >> > > > > > for @AffinityKeyMapped and
creating missing one. From
> > > what
> > > >> I
> > > >> > >> > > understand
> > > >> > >> > > > > > functionality is broken completely
or almost
> > completely,
> > > >> so I
> > > >> > >> guess
> > > >> > >> > > > > testing
> > > >> > >> > > > > > coverage is very weak there.
> > > >> > >> > > > > >
> > > >> > >> > > > > > -Val
> > > >> > >> > > > > >
> > > >> > >> > > > > > On Wed, Jul 12, 2017 at 4:27
PM, Kozlov Maxim <
> > > >> > >> > dreamx.max@gmail.com>
> > > >> > >> > > > > wrote:
> > > >> > >> > > > > >
> > > >> > >> > > > > >> Igniters,
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> jira: https://issues.apache.org/
> > jira/browse/IGNITE-5037
> > > <
> > > >> > >> > > > > >> https://issues.apache.org/jira/browse/IGNITE-5037>
> > > >> > >> > > > > >> How do you look to solve
this ticket by adding two
> > > >> methods to
> > > >> > >> the
> > > >> > >> > > > public
> > > >> > >> > > > > >> IgniteCompute API?
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> @IgniteAsyncSupported
> > > >> > >> > > > > >> public void affinityRun(@NotNull
Collection<String>
> > > >> > cacheNames,
> > > >> > >> > > > > >> Collection<Object>
keys, IgniteRunnable job)
> > > >> > >> > > > > >>    throws IgniteException;
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> @IgniteAsyncSupported
> > > >> > >> > > > > >> public <R> R affinityCall(@NotNull
> Collection<String>
> > > >> > >> cacheNames,
> > > >> > >> > > > > >> Collection<Object>
keys, IgniteCallable<R> job)
> > > >> > >> > > > > >>    throws IgniteException;
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> There is also a question
of how to act when changing
> > the
> > > >> > >> topology
> > > >> > >> > > > during
> > > >> > >> > > > > >> the execution of the job.
> > > >> > >> > > > > >> 1) complete with an exception;
> > > >> > >> > > > > >> 2) stop execution and wait
until the topology is
> > rebuilt
> > > >> and
> > > >> > >> > > continue
> > > >> > >> > > > > >> execution;
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> I think the second way,
do you think?
> > > >> > >> > > > > >>
> > > >> > >> > > > > >> --
> > > >> > >> > > > > >> Best Regards,
> > > >> > >> > > > > >> Max K.
> > > >> > >> > > > > >>
> > > >> > >> > > > > >>
> > > >> > >> > > > > >>
> > > >> > >> > > > > >>
> > > >> > >> > > > > >>
> > > >> > >> > > > >
> > > >> > >> > > > > --
> > > >> > >> > > > > Best Regards,
> > > >> > >> > > > > Max K.
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > >
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > >
> > > >> > > Best regards,
> > > >> > > Alexei Scherbakov
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> >
> > > >> > Best regards,
> > > >> > Alexei Scherbakov
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>



-- 

Best regards,
Alexei Scherbakov

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message