ignite-dev mailing list archives

From Pavel Tupitsyn <ptupit...@apache.org>
Subject Re: Thin client: compute support
Date Tue, 26 Nov 2019 08:49:45 GMT
>  I can't see any usage of request id in query cursors
You are right, cursor id is a separate thing.
Anyway, my point stands.

> client sends long term tasks to nodes and wants to do it with load
> balancing
I still don't get it. Can you please provide equivalent use case with
existing "thick" client?


On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <plehanov.alex@gmail.com>
wrote:

> > And it is fine to use request ID to identify compute tasks (as we do with
> > query cursors).
> I can't see any usage of request id in query cursors. We send query request
> and get cursor id in response. After that, we only use cursor id (to get
> next pages and to close the resource). Did I miss something?
>
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> It's not relevant directly. But there are some cases where it will be
> helpful. For example, if client sends long term tasks to nodes and wants to
> do it with load balancing it will detect topology change only after some
> time in the future with the first response, so load balancing will not work.
> Perhaps we can add an optional "topology version" field to the
> OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>
>
> Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <ptupitsyn@apache.org>:
>
> > Alex,
> >
> > > we will mix entities from different layers (transport layer and request
> > > body)
> > I would not call our message header (which includes the id) "transport
> > layer".
> > TCP is our transport layer. And it is fine to use request ID to identify
> > compute tasks (as we do with query cursors).
> >
> > > we still can't be sure that the task is successfully started on a
> > > server
> > The request to start the task will fail and we'll get a response
> > indicating that right away
> >
> > > we won't ever know about topology change
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> >
> > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <plehanov.alex@gmail.com>
> > wrote:
> >
> > > Pavel, in this case, we will mix entities from different layers
> > > (transport layer and request body), it's not very good. We can achieve
> > > the same behavior with a task id generated on the client side, but there
> > > will be no inter-layer data intersection and I think it will be easier to
> > > implement on both client and server-side. But we still can't be sure that
> > > the task is successfully started on a server. We won't ever know about
> > > topology change, because the topology changed flag will be sent from
> > > server to client only with a response when the task is completed. Do we
> > > accept that?
> > >
> > > Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <ptupitsyn@apache.org>:
> > >
> > > > Alex,
> > > >
> > > > I have a simpler idea. We already do request id handling in the
> > > > protocol, so:
> > > > - Client sends a normal request to execute compute task. Request ID is
> > > > generated as usual.
> > > > - As soon as task is completed, a response is received.
> > > >
> > > > As for cancellation - client can send a new request (with new request
> > > > ID) and (in the body) pass the request ID from above
> > > > as a task identifier. As a result, there are two responses:
> > > > - Cancellation response
> > > > - Task response (with proper cancelled status)
> > > >
> > > > That's it, no need to modify the core of the protocol. One request -
> > > > one response.
> > > >
> > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.alex@gmail.com>
> > > > wrote:
> > > >
> > > > > Pavel, we need to inform the client when the task is completed, we
> > > > > need the ability to cancel the task. I see several ways to implement
> > > > > this:
> > > > >
> > > > > 1. Client sends a request to the server to start a task, server
> > > > > returns task id in response. Server notifies client when task is
> > > > > completed with a new request (from server to client). Client can
> > > > > cancel the task by sending a new request with operation type "cancel"
> > > > > and task id. In this case, we should implement two-way requests.
> > > > > 2. Client generates unique task id and sends a request to the server
> > > > > to start a task, server doesn't reply immediately but waits until task
> > > > > is completed. Client can cancel task by sending new request with
> > > > > operation type "cancel" and task id. In this case, we should decouple
> > > > > request and response on the server-side (currently response is sent
> > > > > right after request was processed). Also, we can't be sure that task
> > > > > is successfully started on a server.
> > > > > 3. Client sends a request to the server to start a task, server
> > > > > returns id in response. Client periodically asks the server about task
> > > > > status. Client can cancel the task by sending new request with
> > > > > operation type "cancel" and task id. This case brings some overhead to
> > > > > the communication channel.
> > > > >
> > > > > Personally, I think that the case with two-way requests is better, but
> > > > > I'm open to any other ideas.
> > > > >
> > > > > Aleksandr,
> > > > >
> > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks
> > > > > overcomplicated. Do we need server-side filtering at all? Wouldn't it
> > > > > be better to send basic info (ids, order, flags) for all nodes (it is
> > > > > a relatively small amount of data) and extended info (attributes) for
> > > > > a selected list of nodes? In this case, we can do basic node
> > > > > filtration on client-side (forClients(), forServers(), forNodeIds(),
> > > > > forOthers(), etc).
> > > > >
> > > > > Do you use standard ClusterNode serialization? There are also metrics
> > > > > serialized with ClusterNode, do we need them on thin client? There are
> > > > > other interfaces to show metrics, I think it's redundant to export
> > > > > metrics to thin clients too.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <lexwert@gmail.com>:
> > > > >
> > > > > > Alex,
> > > > > >
> > > > > > I think you can create a new IEP page and I will fill it with the
> > > > > > Cluster API details.
> > > > > >
> > > > > > In short, I’ve introduced several new codes:
> > > > > >
> > > > > > Cluster API is pretty straightforward:
> > > > > >
> > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > >
> > > > > > Cluster group codes:
> > > > > >
> > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > >
> > > > > > The underlying implementation is based on the thick client logic.
> > > > > >
> > > > > > For every request, we provide a known topology version and if it has
> > > > > > changed, a client updates it first and then re-sends the filtering
> > > > > > request.
> > > > > >
> > > > > > Alongside the topVer a client sends a serialized nodes projection
> > > > > > object that could be considered as a code to value mapping.
> > > > > >
> > > > > > Consider: [{Code = 1, Value = [“DotNet”, “MyAttribute”]}, {Code = 2, Value = 1}]
> > > > > >
> > > > > > Where “1” stands for Attribute filtering and “2” – serverNodesOnly
> > > > > > flag.
> > > > > >
> > > > > > As a result of request processing, a server sends nodeId UUIDs and a
> > > > > > current topVer.
> > > > > >
> > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to
> > > > > > get a serialized ClusterNode object. In addition there should be a
> > > > > > different API method for accessing/updating node metrics.
> > > > > >
> > > > > > Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <skozlov@gridgain.com>:
> > > > > >
> > > > > > > Hi Pavel
> > > > > > >
> > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupitsyn@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > 1. I believe that Cluster operations for Thin Client protocol
> > > > > > > > are already in the works
> > > > > > > > by Alexandr Shapkin. Can't find the ticket though.
> > > > > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > > > > >
> > > > > > > > 2. Proposed changes will work only for Java tasks that are
> > > > > > > > already deployed on server nodes.
> > > > > > > > This is mostly useless for other thin clients we have (Python,
> > > > > > > > PHP, .NET, C++).
> > > > > > > >
> > > > > > >
> > > > > > > I don't think so. The task (execution) is a way to implement own
> > > > > > > layer for the thin client application.
> > > > > > >
> > > > > > >
> > > > > > > > We should think of a way to make this useful for all clients.
> > > > > > > > For example, we may allow sending tasks in some scripting
> > > > > > > > language like Javascript.
> > > > > > > > Thoughts?
> > > > > > > >
> > > > > > >
> > > > > > > The arbitrary code execution from a remote client must be
> > > > > > > protected from malicious code.
> > > > > > > I don't know how it could be designed but without that we open
> > > > > > > the hole to kill cluster.
> > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skozlov@gridgain.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Alex
> > > > > > > > >
> > > > > > > > > The idea is great. But I have some concerns that probably
> > > > > > > > > should be taken into account for design:
> > > > > > > > >
> > > > > > > > >    1. We need to have the ability to stop a task execution,
> > > > > > > > >    something like OP_COMPUTE_CANCEL_TASK operation (client to
> > > > > > > > >    server)
> > > > > > > > >    2. What about task execution timeout? It may help the
> > > > > > > > >    cluster survive buggy tasks
> > > > > > > > >    3. Ignite doesn't have roles/authorization functionality for
> > > > > > > > >    now. But a task is a risky operation for the cluster (for
> > > > > > > > >    security reasons). Could we add new options to the Ignite
> > > > > > > > >    configuration:
> > > > > > > > >       - Explicit turning on for compute task support for thin
> > > > > > > > >       protocol (disabled by default) for whole cluster
> > > > > > > > >       - Explicit turning on for compute task support for a node
> > > > > > > > >       - The list of task names (classes) allowed to execute by
> > > > > > > > >       thin client.
> > > > > > > > >    4. Support the labeling for task that may help to investigate
> > > > > > > > >    issues on cluster (the idea from IEP-34 [1])
> > > > > > > > >
> > > > > > > > > 1.
> > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.alex@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello, Igniters!
> > > > > > > > > >
> > > > > > > > > > I have plans to start implementation of Compute interface for
> > > > > > > > > > Ignite thin client and want to discuss features that should be
> > > > > > > > > > implemented.
> > > > > > > > > >
> > > > > > > > > > We already have Compute implementation for binary-rest clients
> > > > > > > > > > (GridClientCompute), which has the following functionality:
> > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > - Executing task by the name
> > > > > > > > > >
> > > > > > > > > > I think we can implement this functionality in a thin client as
> > > > > > > > > > well.
> > > > > > > > > >
> > > > > > > > > > First of all, we need some operation types to request a list of
> > > > > > > > > > all available nodes and probably node attributes (by a list of
> > > > > > > > > > nodes). Node attributes will be helpful if we decide to
> > > > > > > > > > implement an analog of ClusterGroup#forAttribute or
> > > > > > > > > > ClusterGroup#forPredicate methods in the thin client. Perhaps
> > > > > > > > > > they can be requested lazily.
> > > > > > > > > >
> > > > > > > > > > From the protocol point of view there will be two new
> > > > > > > > > > operations:
> > > > > > > > > >
> > > > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > > > Request: empty
> > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int
> > > > > > > > > > nodesCount, for each node set of node fields (UUID nodeId,
> > > > > > > > > > Object or String consistentId, long order, etc)
> > > > > > > > > >
> > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > > > Response: int nodesCount, for each node: int attributesCount,
> > > > > > > > > > for each node attribute: String name, Object value
> > > > > > > > > > To execute tasks we need something like these methods in the
> > > > > > > > > > client API:
> > > > > > > > > > Object execute(String task, Object arg)
> > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > > > Object affinityExecute(String task, String cache, Object key,
> > > > > > > > > > Object arg)
> > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache,
> > > > > > > > > > Object key, Object arg)
> > > > > > > > > >
> > > > > > > > > > Which can be mapped to protocol operations:
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > Response: Object result
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > > > Request: String cacheName, Object key, String taskName, Object
> > > > > > > > > > arg
> > > > > > > > > > Response: Object result
> > > > > > > > > >
> > > > > > > > > > The second operation is needed because we sometimes can't
> > > > > > > > > > calculate and connect to affinity node on the client-side
> > > > > > > > > > (affinity awareness can be disabled, custom affinity function
> > > > > > > > > > can be used or there can be no connection between client and
> > > > > > > > > > affinity node), but we can make best effort to send request to
> > > > > > > > > > target node if affinity awareness is enabled.
> > > > > > > > > >
> > > > > > > > > > Currently, on the server-side requests are always processed
> > > > > > > > > > synchronously and responses are sent right after request was
> > > > > > > > > > processed. To execute long tasks async we should either change
> > > > > > > > > > this logic or introduce some kind of two-way communication
> > > > > > > > > > between client and server (now only one-way requests from
> > > > > > > > > > client to server are allowed).
> > > > > > > > > >
> > > > > > > > > > Two-way communication can also be useful in the future if we
> > > > > > > > > > will send some server-side generated events to clients.
> > > > > > > > > >
> > > > > > > > > > In case of two-way communication there can be new operations
> > > > > > > > > > introduced:
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > Response: long taskId
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > > > Request: taskId, Object result
> > > > > > > > > > Response: empty
> > > > > > > > > >
> > > > > > > > > > The same for affinity requests.
> > > > > > > > > >
> > > > > > > > > > Also, we can implement not only execute task operation, but
> > > > > > > > > > some other operations from IgniteCompute (broadcast, run,
> > > > > > > > > > call), but it will be useful only for java thin client. And
> > > > > > > > > > even with java thin client we should either implement
> > > > > > > > > > peer-class-loading for thin clients (this also requires two-way
> > > > > > > > > > client-server communication) or put classes with executed
> > > > > > > > > > closures to the server locally.
> > > > > > > > > >
> > > > > > > > > > What do you think about proposed protocol changes?
> > > > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > > > Do we need support of compute methods other than "execute
> > > > > > > > > > task"?
> > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Sergey Kozlov
> > > > > > > > > GridGain Systems
> > > > > > > > > www.gridgain.com
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Sergey Kozlov
> > > > > > > GridGain Systems
> > > > > > > www.gridgain.com
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Alex.
> > > > > >
> > > > >
> > > >
> > >
> >
>
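
For reference, the request-ID scheme proposed earlier in this thread (a normal "execute task" request, then a separate cancel request that carries the first request's ID in its body) could be tracked on the client roughly as below. This is only a sketch under assumptions: the class name PendingComputeRequests and all of its methods are hypothetical and are not part of any Ignite API, and the network layer is left out entirely.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical client-side bookkeeping for compute requests; not an Ignite class. */
class PendingComputeRequests {
    /** Request ID generator shared by all request types, as for any other operation. */
    private final AtomicLong reqIdGen = new AtomicLong();

    /** Tasks that were sent but not answered yet, keyed by the request ID of the "execute" request. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Registers an "execute task" request and returns its request ID. */
    long startTask() {
        long reqId = reqIdGen.incrementAndGet();
        pending.put(reqId, new CompletableFuture<>());
        return reqId;
    }

    /** Returns the future of a task that is still pending, or null if it is unknown. */
    CompletableFuture<Object> future(long reqId) {
        return pending.get(reqId);
    }

    /**
     * Describes a cancel request: element 0 is the new request ID of the cancel
     * request itself, element 1 is the task's request ID passed in the body.
     */
    long[] cancelRequest(long taskReqId) {
        return new long[] {reqIdGen.incrementAndGet(), taskReqId};
    }

    /** Called when the single response for an "execute task" request arrives. */
    void onTaskResponse(long reqId, Object result, boolean cancelled) {
        CompletableFuture<Object> fut = pending.remove(reqId);

        if (fut == null)
            return; // Unknown or already completed request.

        if (cancelled)
            fut.cancel(false);
        else
            fut.complete(result);
    }
}
```

The point of the sketch is that each request still gets exactly one response: the cancel request is answered on its own ID, and the task's future is resolved only by the task response.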

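The client-side filtering suggested earlier in the thread (the server sends basic info for all nodes and the client applies forClients(), forServers(), forNodeIds() locally) might look something like the sketch below. The NodeInfo and ClientSideNodeFilter types are hypothetical stand-ins for whatever basic node info the protocol would return; they are not existing Ignite classes.

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical "basic info" about a node: id, order and a client flag. */
class NodeInfo {
    final UUID id;
    final long order;
    final boolean client;

    NodeInfo(UUID id, long order, boolean client) {
        this.id = id;
        this.order = order;
        this.client = client;
    }
}

/** Hypothetical client-side projection helpers; filters are plain predicates. */
class ClientSideNodeFilter {
    /** Keeps server nodes only. */
    static Predicate<NodeInfo> forServers() {
        return n -> !n.client;
    }

    /** Keeps client nodes only. */
    static Predicate<NodeInfo> forClients() {
        return n -> n.client;
    }

    /** Keeps nodes whose id is in the given set. */
    static Predicate<NodeInfo> forNodeIds(Set<UUID> ids) {
        return n -> ids.contains(n.id);
    }

    /** Applies a filter to the full node list and returns the matching node ids. */
    static List<UUID> select(List<NodeInfo> nodes, Predicate<NodeInfo> filter) {
        return nodes.stream()
            .filter(filter)
            .map(n -> n.id)
            .collect(Collectors.toList());
    }
}
```

Filters compose with Predicate.and(), for example forServers().and(forNodeIds(ids)), so no server round trip is needed for this kind of basic projection; attribute-based filtering would still need the extended info.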