ignite-dev mailing list archives

From Alex Plehanov <plehanov.a...@gmail.com>
Subject Re: Thin client: compute support
Date Tue, 26 Nov 2019 10:34:57 GMT
> Anyway, my point stands.
I can't agree. Why don't you want to use the task id for this? After all, we
don't cancel the request (the request has already been processed), we cancel
the task. So it's more convenient to use the task id here.
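
To illustrate the task id approach, here is a minimal sketch in plain Java. It is not the actual Ignite protocol code: the class and method names are hypothetical, and the map stands in for whatever registry the server would keep. Running tasks are stored under the client-generated task id, so a cancel request only needs to carry that id:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side registry of running tasks, keyed by task id. */
public class TaskIdCancelSketch {
    private final Map<Long, CompletableFuture<Object>> running = new ConcurrentHashMap<>();

    /** Models handling an "execute task" request that carries a task id. */
    public CompletableFuture<Object> start(long taskId) {
        CompletableFuture<Object> fut = new CompletableFuture<>();
        running.put(taskId, fut);
        // Forget the task once it finishes, fails, or is cancelled.
        fut.whenComplete((res, err) -> running.remove(taskId));
        return fut;
    }

    /** Models handling a "cancel" request; true if a running task was cancelled. */
    public boolean cancel(long taskId) {
        CompletableFuture<Object> fut = running.get(taskId);
        return fut != null && fut.cancel(false);
    }

    /** Number of tasks that are still registered as running. */
    public int runningCount() {
        return running.size();
    }

    public static void main(String[] args) {
        TaskIdCancelSketch server = new TaskIdCancelSketch();
        CompletableFuture<Object> task = server.start(1L);
        System.out.println("cancelled: " + server.cancel(1L));
        System.out.println("task future cancelled: " + task.isCancelled());
    }
}
```

The cancel request itself gets its own response, and the original task future completes with a cancelled status, so the request id never has to appear in a request body.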

> Can you please provide equivalent use case with existing "thick" client?
For example:
The cluster consists of one server node.
The client uses some cluster group filter (for example, the forServers()
cluster group).
The client starts to periodically (for example, once per minute) send
long-running (for example, one hour long) tasks to the cluster.
Meanwhile, several server nodes join the cluster.

With a thick client: all server nodes will be used and tasks will be load
balanced.
With a thin client: only one server node will be used, and the client will
detect the topology change only after an hour.
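
The thin client side of this scenario can be modelled roughly as follows. This is a hypothetical sketch, not Ignite code, and all names are made up for illustration: the client balances tasks only across the nodes it currently knows about and refreshes that list only when a task response arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of a thin client that learns topology only from responses. */
public class StaleTopologySketch {
    /** Server nodes the client currently knows about. */
    private final List<String> knownNodes = new ArrayList<>();

    /** Round-robin counter used for load balancing. */
    private int next;

    public StaleTopologySketch(List<String> initialNodes) {
        knownNodes.addAll(initialNodes);
    }

    /** Picks a target node for the next task from the known nodes only. */
    public String pickNode() {
        return knownNodes.get(next++ % knownNodes.size());
    }

    /** Models a task response that carries the "topology changed" flag. */
    public void onTaskResponse(List<String> currentNodes) {
        knownNodes.clear();
        knownNodes.addAll(currentNodes);
        next = 0;
    }

    public static void main(String[] args) {
        StaleTopologySketch client = new StaleTopologySketch(Arrays.asList("server-1"));

        // New servers have joined, but no task response has arrived yet.
        for (int i = 0; i < 3; i++)
            System.out.println("before first response: " + client.pickNode());

        // The first long task finishes and the client refreshes its topology.
        client.onTaskResponse(Arrays.asList("server-1", "server-2", "server-3"));
        for (int i = 0; i < 3; i++)
            System.out.println("after first response: " + client.pickNode());
    }
}
```

Until onTaskResponse is called, every task goes to the single known server, which is the one-hour window described above.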


Tue, 26 Nov 2019 at 11:50, Pavel Tupitsyn <ptupitsyn@apache.org>:

> >  I can't see any usage of request id in query cursors
> You are right, cursor id is a separate thing.
> Anyway, my point stands.
>
> > client sends long term tasks to nodes and wants to do it with load
> > balancing
> I still don't get it. Can you please provide equivalent use case with
> existing "thick" client?
>
>
> On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <plehanov.alex@gmail.com>
> wrote:
>
> > > And it is fine to use request ID to identify compute tasks (as we do
> > > with query cursors).
> > I can't see any usage of request id in query cursors. We send a query
> > request and get a cursor id in the response. After that, we only use the
> > cursor id (to get next pages and to close the resource). Did I miss
> > something?
> >
> > > Looks like I'm missing something - how is topology change relevant to
> > > executing compute tasks from client?
> > It's not relevant directly. But there are some cases where it will be
> > helpful. For example, if the client sends long-running tasks to nodes and
> > wants to do it with load balancing, it will detect a topology change only
> > some time later, with the first response, so load balancing will not
> > work. Perhaps we can add an optional "topology version" field to the
> > OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> >
> >
> > Mon, 25 Nov 2019 at 22:42, Pavel Tupitsyn <ptupitsyn@apache.org>:
> >
> > > Alex,
> > >
> > > > we will mix entities from different layers (transport layer and
> > > > request body)
> > > I would not call our message header (which includes the id) "transport
> > > layer".
> > > TCP is our transport layer. And it is fine to use request ID to
> > > identify compute tasks (as we do with query cursors).
> > >
> > > > we still can't be sure that the task is successfully started on a
> > > > server
> > > The request to start the task will fail and we'll get a response
> > > indicating that right away
> > >
> > > > we won't ever know about topology change
> > > Looks like I'm missing something - how is topology change relevant to
> > > executing compute tasks from client?
> > >
> > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <plehanov.alex@gmail.com>
> > > wrote:
> > > >
> > > > Pavel, in this case we will mix entities from different layers
> > > > (transport layer and request body), which is not very good. We can
> > > > achieve the same behavior with a task id generated on the client side,
> > > > but there will be no inter-layer data intersection, and I think it will
> > > > be easier to implement on both the client and the server side. But we
> > > > still can't be sure that the task is successfully started on a server.
> > > > We won't ever know about a topology change, because the "topology
> > > > changed" flag will be sent from server to client only with the
> > > > response, when the task is completed. Do we accept that?
> > > >
> > > > Mon, 25 Nov 2019 at 19:07, Pavel Tupitsyn <ptupitsyn@apache.org>:
> > > >
> > > > > Alex,
> > > > >
> > > > > I have a simpler idea. We already do request id handling in the
> > > > > protocol, so:
> > > > > - Client sends a normal request to execute compute task. Request ID
> > > > > is generated as usual.
> > > > > - As soon as task is completed, a response is received.
> > > > >
> > > > > As for cancellation - client can send a new request (with new
> > > > > request ID) and (in the body) pass the request ID from above
> > > > > as a task identifier. As a result, there are two responses:
> > > > > - Cancellation response
> > > > > - Task response (with proper cancelled status)
> > > > >
> > > > > That's it, no need to modify the core of the protocol. One request -
> > > > > one response.
> > > > >
> > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.alex@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pavel, we need to inform the client when the task is completed, and
> > > > > > we need the ability to cancel the task. I see several ways to
> > > > > > implement this:
> > > > > >
> > > > > > 1. Client sends a request to the server to start a task, server
> > > > > > returns a task id in the response. Server notifies the client when
> > > > > > the task is completed with a new request (from server to client).
> > > > > > Client can cancel the task by sending a new request with operation
> > > > > > type "cancel" and the task id. In this case, we should implement
> > > > > > 2-way requests.
> > > > > > 2. Client generates a unique task id and sends a request to the
> > > > > > server to start a task; server doesn't reply immediately but waits
> > > > > > until the task is completed. Client can cancel the task by sending a
> > > > > > new request with operation type "cancel" and the task id. In this
> > > > > > case, we should decouple request and response on the server side
> > > > > > (currently the response is sent right after the request was
> > > > > > processed). Also, we can't be sure that the task is successfully
> > > > > > started on a server.
> > > > > > 3. Client sends a request to the server to start a task, server
> > > > > > returns an id in the response. Client periodically asks the server
> > > > > > about the task status. Client can cancel the task by sending a new
> > > > > > request with operation type "cancel" and the task id. This case
> > > > > > brings some overhead to the communication channel.
> > > > > >
> > > > > > Personally, I think that the case with 2-way requests is better,
> > > > > > but I'm open to any other ideas.
> > > > > >
> > > > > > Aleksandr,
> > > > > >
> > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks
> > > > > > overcomplicated. Do we need server-side filtering at all? Wouldn't
> > > > > > it be better to send basic info (ids, order, flags) for all nodes
> > > > > > (a relatively small amount of data) and extended info (attributes)
> > > > > > for a selected list of nodes? In this case, we can do basic node
> > > > > > filtering on the client side (forClients(), forServers(),
> > > > > > forNodeIds(), forOthers(), etc).
> > > > > >
> > > > > > Do you use standard ClusterNode serialization? Metrics are also
> > > > > > serialized with ClusterNode; do we need them on the thin client?
> > > > > > Other interfaces already exist to show metrics, so I think it's
> > > > > > redundant to export metrics to thin clients too.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Fri, 22 Nov 2019 at 20:15, Aleksandr Shapkin <lexwert@gmail.com>:
> > > > > >
> > > > > > > Alex,
> > > > > > >
> > > > > > > I think you can create a new IEP page and I will fill it with the
> > > > > > > Cluster API details.
> > > > > > >
> > > > > > > In short, I've introduced several new codes:
> > > > > > >
> > > > > > > Cluster API is pretty straightforward:
> > > > > > >
> > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > > >
> > > > > > > Cluster group codes:
> > > > > > >
> > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > > >
> > > > > > > The underlying implementation is based on the thick client logic.
> > > > > > >
> > > > > > > With every request, we provide the known topology version; if it
> > > > > > > has changed, the client first updates it and then re-sends the
> > > > > > > filtering request.
> > > > > > >
> > > > > > > Alongside the topVer, a client sends a serialized nodes projection
> > > > > > > object that can be considered a code-to-value mapping.
> > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]},
> > > > > > > {Code = 2, Value = 1}]
> > > > > > > where "1" stands for attribute filtering and "2" for the
> > > > > > > serverNodesOnly flag.
> > > > > > >
> > > > > > > As a result of request processing, a server sends nodeId UUIDs and
> > > > > > > the current topVer.
> > > > > > >
> > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to
> > > > > > > get a serialized ClusterNode object. In addition, there should be a
> > > > > > > different API method for accessing/updating node metrics.
> > > > > > >
> > > > > > > Thu, 21 Nov 2019 at 12:32, Sergey Kozlov <skozlov@gridgain.com>:
> > > > > > >
> > > > > > > > Hi Pavel
> > > > > > > >
> > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupitsyn@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol
> > > > > > > > > are already in the works by Alexandr Shapkin. Can't find the
> > > > > > > > > ticket though.
> > > > > > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > > > > > >
> > > > > > > > > 2. Proposed changes will work only for Java tasks that are
> > > > > > > > > already deployed on server nodes.
> > > > > > > > > This is mostly useless for other thin clients we have (Python,
> > > > > > > > > PHP, .NET, C++).
> > > > > > > > >
> > > > > > > >
> > > > > > > > I don't think so. A task (execution) is a way to implement one's
> > > > > > > > own layer for the thin client application.
> > > > > > > >
> > > > > > > >
> > > > > > > > > We should think of a way to make this useful for all clients.
> > > > > > > > > For example, we may allow sending tasks in some scripting
> > > > > > > > > language like Javascript.
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Arbitrary code execution from a remote client must be protected
> > > > > > > > from malicious code.
> > > > > > > > I don't know how it could be designed, but without that we open a
> > > > > > > > hole that can kill the cluster.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skozlov@gridgain.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Alex
> > > > > > > > > >
> > > > > > > > > > The idea is great. But I have some concerns that probably
> > > > > > > > > > should be taken into account for design:
> > > > > > > > > >
> > > > > > > > > >    1. We need to have the ability to stop a task execution,
> > > > > > > > > >    something like an OP_COMPUTE_CANCEL_TASK operation (client
> > > > > > > > > >    to server)
> > > > > > > > > >    2. What about a task execution timeout? It may help the
> > > > > > > > > >    cluster survive buggy tasks
> > > > > > > > > >    3. Ignite doesn't have roles/authorization functionality
> > > > > > > > > >    for now. But a task is a risky operation for the cluster
> > > > > > > > > >    (for security reasons). Could we add new options to the
> > > > > > > > > >    Ignite configuration:
> > > > > > > > > >       - Explicitly enabling compute task support for the thin
> > > > > > > > > >       protocol (disabled by default) for the whole cluster
> > > > > > > > > >       - Explicitly enabling compute task support for a node
> > > > > > > > > >       - The list of task names (classes) allowed to be
> > > > > > > > > >       executed by a thin client.
> > > > > > > > > >    4. Support labeling for tasks, which may help to
> > > > > > > > > >    investigate issues on the cluster (the idea from IEP-34 [1])
> > > > > > > > > >
> > > > > > > > > > 1.
> > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.alex@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello, Igniters!
> > > > > > > > > > >
> > > > > > > > > > > I plan to start implementing the Compute interface for the Ignite thin
> > > > > > > > > > > client and want to discuss the features that should be implemented.
> > > > > > > > > > >
> > > > > > > > > > > We already have a Compute implementation for binary-rest clients
> > > > > > > > > > > (GridClientCompute), which has the following functionality:
> > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > > - Executing a task by name
> > > > > > > > > > >
> > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > > > > > >
> > > > > > > > > > > First of all, we need some operation types to request a list of all
> > > > > > > > > > > available nodes and probably node attributes (by a list of nodes). Node
> > > > > > > > > > > attributes will be helpful if we decide to implement an analog of the
> > > > > > > > > > > ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the
> > > > > > > > > > > thin client. Perhaps they can be requested lazily.
> > > > > > > > > > >
> > > > > > > > > > > From the protocol point of view there will be two new operations:
> > > > > > > > > > >
> > > > > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > > > > Request: empty
> > > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount,
> > > > > > > > > > > for each node a set of node fields (UUID nodeId, Object or String
> > > > > > > > > > > consistentId, long order, etc)
> > > > > > > > > > >
> > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each
> > > > > > > > > > > node attribute: String name, Object value
> > > > > > > > > > >
> > > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > > > > > Object execute(String task, Object arg)
> > > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key,
> > > > > > > > > > > Object arg)
> > > > > > > > > > >
> > > > > > > > > > > Which can be mapped to protocol operations:
> > > > > > > > > > >
> > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > Response: Object result
> > > > > > > > > > >
> > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > > > > > Response: Object result
> > > > > > > > > > >
> > > > > > > > > > > The second operation is needed because we sometimes can't calculate and
> > > > > > > > > > > connect to the affinity node on the client side (affinity awareness can
> > > > > > > > > > > be disabled, a custom affinity function can be used, or there can be no
> > > > > > > > > > > connection between the client and the affinity node), but we can make a
> > > > > > > > > > > best effort to send the request to the target node if affinity awareness
> > > > > > > > > > > is enabled.
> > > > > > > > > > >
> > > > > > > > > > > Currently, on the server side, requests are always processed synchronously
> > > > > > > > > > > and responses are sent right after the request is processed. To execute
> > > > > > > > > > > long tasks asynchronously we should either change this logic or introduce
> > > > > > > > > > > some kind of two-way communication between client and server (now only
> > > > > > > > > > > one-way requests from client to server are allowed).
> > > > > > > > > > >
> > > > > > > > > > > Two-way communication can also be useful in the future if we send some
> > > > > > > > > > > server-side generated events to clients.
> > > > > > > > > > >
> > > > > > > > > > > In case of two-way communication, new operations can be introduced:
> > > > > > > > > > >
> > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > Response: long taskId
> > > > > > > > > > >
> > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > > > > Request: taskId, Object result
> > > > > > > > > > > Response: empty
> > > > > > > > > > >
> > > > > > > > > > > The same for affinity requests.
> > > > > > > > > > >
> > > > > > > > > > > Also, we can implement not only the execute task operation, but also some
> > > > > > > > > > > other operations from IgniteCompute (broadcast, run, call), but they will
> > > > > > > > > > > be useful only for the Java thin client. And even with the Java thin client
> > > > > > > > > > > we should either implement peer-class-loading for thin clients (this also
> > > > > > > > > > > requires two-way client-server communication) or put classes with the
> > > > > > > > > > > executed closures on the server locally.
> > > > > > > > > > >
> > > > > > > > > > > What do you think about the proposed protocol changes?
> > > > > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > > > > Do we need support for compute methods other than "execute task"?
> > > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Sergey Kozlov
> > > > > > > > > > GridGain Systems
> > > > > > > > > > www.gridgain.com
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Sergey Kozlov
> > > > > > > > GridGain Systems
> > > > > > > > www.gridgain.com
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Alex.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
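
The client-side filtering suggested earlier in the thread (basic info for all nodes sent once, with filters such as forServers() and forNodeIds() applied locally) could look roughly like this. It is a hypothetical sketch in plain Java, not the Ignite API: the NodeInfo shape (id, order, client flag) is an assumption based on the "ids, order, flags" wording above.

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

/** Hypothetical client-side node filters over basic node info. */
public class ClientSideFilterSketch {
    /** Basic per-node info; the exact fields are an assumption. */
    public static final class NodeInfo {
        public final UUID id;
        public final long order;
        public final boolean client;

        public NodeInfo(UUID id, long order, boolean client) {
            this.id = id;
            this.order = order;
            this.client = client;
        }
    }

    /** Local analog of forServers(): keeps only non-client nodes. */
    public static List<NodeInfo> forServers(List<NodeInfo> nodes) {
        return nodes.stream().filter(n -> !n.client).collect(Collectors.toList());
    }

    /** Local analog of forClients(): keeps only client nodes. */
    public static List<NodeInfo> forClients(List<NodeInfo> nodes) {
        return nodes.stream().filter(n -> n.client).collect(Collectors.toList());
    }

    /** Local analog of forNodeIds(): keeps only nodes with the given ids. */
    public static List<NodeInfo> forNodeIds(List<NodeInfo> nodes, Set<UUID> ids) {
        return nodes.stream().filter(n -> ids.contains(n.id)).collect(Collectors.toList());
    }
}
```

Attribute-based filters would still need the extended info (attributes) to be requested for the selected nodes.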
