ignite-dev mailing list archives

From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Service grid redesign
Date Fri, 23 Mar 2018 15:12:54 GMT
I think it is about time we take another look at our service functionality.
All the points you have raised sound reasonable to me.

On Fri, Mar 23, 2018 at 6:01 PM, Denis Mekhanikov <dmekhanikov@gmail.com>
wrote:

> Igniters,
>
> I'd like to start a discussion on Ignite service grid redesign.
> Our current architecture has a number of problems that have to be
> addressed.
>
> Here are the most severe ones:
>
> One of them is the lack of a guarantee that a service is successfully
> deployed and ready for work by the time the IgniteServices.deploy*()
> methods return.
> Furthermore, if an exception is thrown from the Service.init() method, the
> deploying side cannot receive it, or even tell that the service is in an
> unusable state.
> So you may end up in a situation where you deployed a service without
> receiving any errors, then called one of its methods, and hung
> indefinitely on this invocation.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
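
To make the first problem concrete, here is a minimal sketch in plain Java
(no Ignite classes) of a deployment call whose future completes only after
init() has run, so that an init failure reaches the deploying side.
SketchService, DeploySketch and deployAsync are made-up names for
illustration only; none of them is existing Ignite API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical stand-in for a service with an init step; not the Ignite Service interface. */
interface SketchService {
    void init() throws Exception;
}

final class DeploySketch {
    /**
     * Runs init() on the given executor. The returned future completes normally
     * only if init() returned, and exceptionally with whatever exception it threw.
     */
    static CompletableFuture<Void> deployAsync(SketchService svc, Executor exec) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        exec.execute(() -> {
            try {
                svc.init();
                fut.complete(null);
            } catch (Exception e) {
                // Surface the init failure to the caller instead of swallowing it.
                fut.completeExceptionally(e);
            }
        });
        return fut;
    }
}
```

With something along these lines, a caller that waits on the future either
knows the service is ready or gets the init exception, rather than hanging
on the first invocation.
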
>
> Another problem is locking during service deployment on an unstable
> topology.
> This issue is caused by missing updates in continuous query listeners on
> the internal cache.
> It is hard to reproduce, but it does happen sometimes. We shouldn't allow
> deployment methods to hang without reporting anything.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
>
> I think we should change the deployment procedure to make it more
> reliable.
> Moving from operating on the internal replicated service cache to sending
> custom discovery events seems like a good idea.
> Service deployment may trigger a discovery event that makes the chosen
> nodes deploy the service, and the same event will notify other nodes about
> the deployed service instances.
> This will eliminate the need for distributed transactions on the internal
> replicated system cache and make the service deployment protocol more
> transparent.
>
> There are a few points that should be taken into account, though.
>
> First of all, we can't wait for services to be deployed and initialised in
> the discovery thread.
> So we need to make the notification about the service deployment result
> asynchronous, presumably over the communication protocol.
> I can think of a procedure similar to the current exchange protocol, where
> service deployment is initiated by an initial discovery message,
> followed by asynchronous notifications from the hosting servers over
> communication. Finally, one more discovery message will notify all
> nodes about the service deployment result and the location of the deployed
> service instances. The coordinator will be responsible for collecting the
> deployment results in this scheme.
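
The collection step on the coordinator could be sketched roughly as below.
This is plain Java under assumptions of my own: DeploymentCollector and its
methods are hypothetical names, node IDs are plain strings, and a null error
stands for a successful deployment. It only shows the bookkeeping, not the
discovery or communication messages themselves.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical coordinator-side bookkeeping for one service deployment. */
final class DeploymentCollector {
    private final Set<String> pending;                          // nodes that have not reported yet
    private final List<String> deployed = new ArrayList<>();    // nodes that deployed successfully
    private final Map<String, String> errors = new TreeMap<>(); // node ID -> error message

    DeploymentCollector(Collection<String> assignedNodes) {
        pending = new TreeSet<>(assignedNodes);
    }

    /**
     * Records a result received from a hosting node (error == null means success).
     * Reports from unknown or already counted nodes are ignored.
     * Returns true once every assigned node has reported, which is the point
     * where the final discovery message could be sent.
     */
    boolean onResult(String nodeId, String error) {
        if (pending.remove(nodeId)) {
            if (error == null)
                deployed.add(nodeId);
            else
                errors.put(nodeId, error);
        }
        return pending.isEmpty();
    }

    List<String> deployedNodes() {
        return new ArrayList<>(deployed);
    }

    Map<String, String> errors() {
        return new TreeMap<>(errors);
    }
}
```

The deployed node list and the error map are what the final discovery
message would need to carry in this sketch.
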
>
> Another problem is failover when some nodes fail during deployment or
> afterwards.
> The following cases should be handled:
>
>    1. coordinator failure during deployment;
>    2. failure of nodes that were chosen to host the service, during
>    deployment;
>    3. failure of nodes that contain deployed services, after the
>    deployment.
>
> The first case may be resolved either by continuing the deployment with a
> new coordinator or by cancelling it.
> The second case will require another node to be chosen and notified. Maybe
> another discovery message will be needed.
> The third case will require redeployment, so the coordinator should track
> topology changes and redeploy failed services.
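
For the second and third cases, choosing a replacement host might look
something like the sketch below. Again this is plain Java with hypothetical
names (FailoverPlanner, replacementFor); a real implementation would also
have to respect node filters and per-node instance limits, which are left
out here.

```java
import java.util.Optional;
import java.util.Set;
import java.util.SortedSet;

/** Hypothetical helper that picks a new host after a hosting node has failed. */
final class FailoverPlanner {
    /**
     * Returns the first alive node (in the sorted set's order) that does not
     * already host the service, or an empty Optional if there is none.
     *
     * @param remainingHosts nodes still hosting the service after the failure
     * @param aliveNodes     nodes in the current topology
     */
    static Optional<String> replacementFor(Set<String> remainingHosts, SortedSet<String> aliveNodes) {
        for (String node : aliveNodes) {
            if (!remainingHosts.contains(node))
                return Optional.of(node);
        }
        return Optional.empty();
    }
}
```

An empty result corresponds to the situation where the service cannot be
redeployed until the topology changes again.
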
>
> Another good improvement would be service versioning. This matter was
> already discussed in another thread:
> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> Let's resume this discussion and state the final decision here.
> This feature is closely connected to peer class loading, which currently
> does not work for services.
> So, service versioning should be implemented along with peer class loading.
> JIRA ticket for versioning:
> https://issues.apache.org/jira/browse/IGNITE-6069
> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
>
> Please share your thoughts. Constructive criticism is highly appreciated.
>
> Denis
>
