ignite-user mailing list archives

From Denis Magda <dma...@gridgain.com>
Subject Re: Distributed queue problem with peerClassLoading enabled
Date Wed, 24 Feb 2016 15:07:31 GMT
Hi Mateusz,

Please see inline

On 2/17/2016 11:30 AM, mp wrote:
> Denis,
>
> Please see below for answers.
>
> Cheers,
> -Mateusz
>
> On Tue, Feb 16, 2016 at 10:18 PM, Denis Magda <dmagda@gridgain.com 
> <mailto:dmagda@gridgain.com>> wrote:
>
>     Hi Mateusz,
>
>     I've revisited the whole discussion from the beginning and should
>     say that the solution based on the distributed queue won't work
>     for you even if all the issues listed below are fixed.
>
>     Presently you're placing in the queue tasks coming from different
>     nodes with different class versions. Even if the tasks are stored
>     in the queue in a binary form they have to be deserialized to
>     their original form before execution. This will lead to
>     ClassNotFoundExceptions in your case.
>
>
> No, I'm not filling the queue with different class versions. My test 
> case is quite simple. Please check the original description specified 
> in 
> http://apache-ignite-users.70518.x6.nabble.com/Distributed-queue-problem-with-peerClassLoading-enabled-tp1762p1780.html
>
> In 1.4.0 it works as follows:
>
> 1. A server node is started.
> 2. Client node (which acts as a server as well) is started, creates a 
> queue, fills it with 100 tasks, and both nodes are polling tasks from 
> the queue. When the queue is empty (all tasks called), the test ends. 
> At this point the client node leaves the cluster as well.
> 3. Then you run the same test again *without any code modifications*, 
> ie, exactly the same class on the same client node. I assume that even 
> if the server node already cached the class during the 1st test run, 
> it should be no problem for the 2nd test run. But unfortunately it 
> fails with "ignitetest.Task cannot be cast to ignitetest.Task", which 
> I believe is due to class loaders being different.
>
> In 1.5.0 even the 1st run of the test fails due to a problem you 
> reported in https://issues.apache.org/jira/browse/IGNITE-2339
>
It will pass the 1st run if you switch to OptimizedMarshaller, which is 
the default in 1.4.0. Just call IgniteConfiguration.setMarshaller(new 
OptimizedMarshaller()) and everything will work in 1.5.0 the same way as 
it does in 1.4.0.
However, the second run will still fail, as it does with 1.4.0.
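
For what it's worth, the "ignitetest.Task cannot be cast to 
ignitetest.Task" message from the second run is what the JVM reports 
whenever one class name has been defined by two different class loaders, 
so your guess about the class loaders sounds plausible. Below is a 
minimal plain-Java sketch of that effect, with no Ignite involved. The 
class SameNameDifferentLoader and its helper methods are made up for this 
illustration. It compiles a throwaway ignitetest.Task at run time (so it 
needs a JDK, not just a JRE), and it only assumes that each test run ends 
up with its own loader for the class; it does not show how Ignite deploys 
classes internally.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SameNameDifferentLoader {

    /** Compiles a tiny ignitetest.Task class into a temp directory and returns that directory. */
    public static Path compileTask() throws Exception {
        Path dir = Files.createTempDirectory("task-classes");
        Path pkg = Files.createDirectories(dir.resolve("ignitetest"));
        Path src = pkg.resolve("Task.java");
        Files.write(src, "package ignitetest; public class Task {}".getBytes(StandardCharsets.UTF_8));

        // The system compiler is only available when running on a JDK.
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null)
            throw new IllegalStateException("A JDK is required to run this sketch");

        int rc = javac.run(null, null, null, "-d", dir.toString(), src.toString());
        if (rc != 0)
            throw new IllegalStateException("Compilation of Task.java failed");

        return dir;
    }

    /** Loads ignitetest.Task through a brand-new loader that shares nothing with other loaders. */
    public static Class<?> loadWithFreshLoader(Path dir) throws Exception {
        // Parent is null, so only core JDK classes are shared between the loaders.
        URLClassLoader ldr = new URLClassLoader(new URL[] {dir.toUri().toURL()}, null);
        return ldr.loadClass("ignitetest.Task");
    }

    /** Returns true if an instance from one loader cannot be cast to the same-named class from another. */
    public static boolean castFailsAcrossLoaders() {
        try {
            Path dir = compileTask();
            Class<?> first = loadWithFreshLoader(dir);   // stands in for the 1st test run
            Class<?> second = loadWithFreshLoader(dir);  // stands in for the 2nd test run
            Object task = first.getDeclaredConstructor().newInstance();
            try {
                second.cast(task);
                return false;
            }
            catch (ClassCastException e) {
                System.out.println(e.getMessage());
                return true;
            }
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFailsAcrossLoaders());
    }
}
```

Both loaders read the very same Task.class file, yet the cast is 
rejected, because the JVM identifies a class by its name together with 
its defining loader.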

> Actually, what I would like to achieve is the following (in 3 steps):
>
> 1. Have my original test pass, regardless of how many times I call the 
> test. This is a baseline.
> 2. Modify my test so that after I finish each test run, I can *modify* 
> the code of Task class, and my change is reflected in the subsequent 
> test run, ie, the server node will process the Task class with its new 
> definition.
> 3. Multiple client nodes can have their own Task class definitions 
> that run on the cluster at the same time, assuming that they create 
> *different* queues for their tasks. Actually they will also broadcast 
> their own IgniteCallables to the cluster (note that in particular they 
> can broadcast the same IgniteCallable class, each in its own different 
> version). But I'm always assuming that a given queue is 
> created/maintained/filled by one client node therefore it will never 
> mix objects with different versions of the same class.
>
>
>     My suggestion is the same as it was before: avoid using the
>     distributed queue and instead send Ignite compute tasks
>     for execution from the beginning.
>
>     Will this work for you?
>
>
> In my setup, it is simply more logical to model the computation as 
> IgniteCallables being broadcast to the cluster and then they pull 
> tasks from a distributed queue. Actually, the distributed queue (and 
> other distributed structures provided by Ignite) was one of main 
> reasons that attracted me to try it out. And I really like them and 
> their simple usage :)
> Without the use of distributed structures I think managing my 
> computations would probably be more cumbersome. But I will of course 
> think about it, and maybe I will come up with a suitable model.

I think I now see why you decided to use a distributed queue in this 
particular case. You want to submit several jobs for execution at once 
and have them executed evenly by all the nodes that are available.
Is this assumption correct?

If I'm right, then I suggest you create your own ComputeTask that is 
split into jobs, which are mapped to the less loaded nodes by a load 
balancer.
I've prepared a code sample for you [1]. Please go through it and let me 
know if anything is unclear.

Here [2] you can read more about the load balancing that the compute 
engine implements and uses out of the box.
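
To illustrate the idea only: the sketch below is plain Java, not Ignite 
code and not the gist from [1]. It assumes every job has a known cost and 
greedily maps each job to the node with the smallest accumulated load. 
LeastLoadedMapping, mapJobs, the cost values and the tie-breaking rule 
are all hypothetical; the balancers that ship with Ignite are described 
in [2] and may work differently.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class LeastLoadedMapping {

    /**
     * Maps each job, in order, to the node with the smallest accumulated cost so far
     * and returns the resulting total load per node.
     */
    public static long[] mapJobs(int[] jobCosts, int nodeCount) {
        long[] load = new long[nodeCount];

        // Node indices ordered by current load; ties go to the lower index.
        PriorityQueue<Integer> nodes = new PriorityQueue<>(
            Comparator.<Integer>comparingLong(n -> load[n]).thenComparingInt(n -> n));

        for (int n = 0; n < nodeCount; n++)
            nodes.add(n);

        for (int cost : jobCosts) {
            int n = nodes.poll();  // least loaded node right now
            load[n] += cost;       // safe to update: the node is outside the queue
            nodes.add(n);          // re-insert with its updated load
        }

        return load;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mapJobs(new int[] {5, 3, 2, 2, 4}, 2)));
    }
}
```

With costs {5, 3, 2, 2, 4} and two nodes, main prints [7, 9]. Compared 
with workers polling a shared queue, the mapping decision here is made 
once, up front, by whoever splits the task.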

[1] https://gist.github.com/dmagda/3a78e7d01d9fffd7bc66
[2] https://apacheignite.readme.io/docs/load-balancing

Regards,
Denis

>
>
>     --
>     Denis
>
>
>
>     On 2/16/2016 11:33 AM, mp wrote:
>>     Hi Denis,
>>
>>     Many thanks! I look forward to 1.6 then.
>>     Please also consider the following statement made by Dmitriy on
>>     Nov 03, 2015 (see his message in the thread):
>>
>>     "With that in mind, we will be removing the requirement for
>>     caches to work only with SHARED and CONTINUOUS deployment modes,
>>     so you will be able to use PRIVATE or ISOLATED deployment modes
>>     to deploy your computations."
>>
>>     As far as I understand, the above planned change is not covered
>>     by any Jira ticket.
>>
>>     Cheers,
>>     -Mateusz
>>
>>
>>
>>     On Fri, Feb 12, 2016 at 10:35 PM, Denis Magda
>>     <dmagda@gridgain.com <mailto:dmagda@gridgain.com>> wrote:
>>
>>         Hi Mateusz,
>>
>>         I assigned both tickets that you have problems with to
>>         myself. They will be fixed as a part of the next release.
>>         https://issues.apache.org/jira/browse/IGNITE-2339
>>         https://issues.apache.org/jira/browse/IGNITE-1823
>>
>>         There is one more issue that was reproduced locally and
>>         refers to unexpected cache undeployment when the binary
>>         marshaller is used.
>>         https://issues.apache.org/jira/browse/IGNITE-2647
>>
>>         Thanks for your patience and your continued interest in
>>         Ignite.
>>
>>         Regards,
>>         Denis
>>
>>
>>         On 2/12/2016 4:41 PM, mp wrote:
>>>         Hi Denis,
>>>
>>>         But my test still fails in version 1.5 with default (ie,
>>>         binary) marshaller. See my message from January 7, and your
>>>         reply in which you mentioned a new Jira ticket for a bug
>>>         concerning the new binary marshaller:
>>>         https://issues.apache.org/jira/browse/IGNITE-2339
>>>
>>>         Basically, my test case (see
>>>         https://issues.apache.org/jira/browse/IGNITE-1823 ) fails in
>>>         all of the scenarios I tried:
>>>
>>>         1. Binary marshaller + default deployment mode
>>>         2. Binary marshaller + shared deployment mode
>>>         3. Binary marshaller + private deployment mode
>>>         4. Optimized marshaller + default deployment mode
>>>         5. Optimized marshaller + shared deployment mode
>>>         6. Optimized marshaller + private deployment mode
>>>
>>>         Would you have any hint/advice on how I could proceed? Is
>>>         there any chance of fixing the issues related to my test case?
>>>
>>>         Thanks for your help,
>>>         -Mateusz
>>>
>>>
>>>         On Wed, Feb 10, 2016 at 4:46 PM, Denis Magda
>>>         <dmagda@gridgain.com <mailto:dmagda@gridgain.com>> wrote:
>>>
>>>             Hi Mateusz,
>>>
>>>             In version 1.5 we released the binary object format [1],
>>>             which stores cache data in a form that is independent of
>>>             the class version. Thus you don't need to have any
>>>             classes on the server side.
>>>             This allows dynamic changes to an object's
>>>             structure, and even allows multiple clients with
>>>             different versions of class definitions to co-exist.
>>>
>>>             In my understanding if you switch to this format you
>>>             will be able to support your use case.
>>>
>>>             If something is unclear don't hesitate to ask.
>>>
>>>             [1] https://apacheignite.readme.io/docs/binary-marshaller
>>>
>>>             --
>>>             Denis
>>>
>>>
>>>             On 2/10/2016 4:06 PM, mp wrote:
>>>>             Hi Denis,
>>>>
>>>>             Thanks for your reply.
>>>>             So, summing up, it seems that in the context of my use
>>>>             case, version 1.5 does not differ from 1.4? Which means
>>>>             that I still cannot achieve my goal: different versions
>>>>             of the same class (from different clients) running on
>>>>             the cluster at the same time?
>>>>
>>>>             As far as I understand this involves:
>>>>             1. https://issues.apache.org/jira/browse/IGNITE-1823
>>>>             2. https://issues.apache.org/jira/browse/IGNITE-2339
>>>>             3. Removing the requirement for caches to work only
>>>>             with SHARED and CONTINUOUS deployment modes (this was
>>>>             announced by Dmitriy in
>>>>             http://apache-ignite-users.70518.x6.nabble.com/Distributed-queue-problem-with-peerClassLoading-enabled-tp1762p1829.html
>>>>             )
>>>>
>>>>             Is there any chance the above use case will be possible
>>>>             in near future (any upcoming version)?
>>>>
>>>>             I really like the API and concept of Ignite. If only I
>>>>             could achieve the above scenario...
>>>>
>>>>             Cheers,
>>>>             -Mateusz
>>>>
>>>>
>>>>
>>>>             On Thu, Jan 7, 2016 at 5:25 PM, Denis Magda
>>>>             <dmagda@gridgain.com <mailto:dmagda@gridgain.com>>
>>>>             wrote:
>>>>
>>>>                 Mateusz,
>>>>
>>>>                 It doesn’t work for now because peerClassLoading
>>>>                 doesn’t work for objects that are stored in the
>>>>                 binary format in a cache.
>>>>                 Since BinaryMarshaller is the default starting
>>>>                 from 1.5, all objects are stored in caches in
>>>>                 this format by default.
>>>>
>>>>                 If you prefer to turn this behavior off, you can
>>>>                 set IgniteConfiguration.setMarshaller(new
>>>>                 OptimizedMarshaller()) for every node and your test
>>>>                 should work as before.
>>>>
>>>>                 --
>>>>                 Denis
>>>>
>>>>>                 On Jan 7, 2016, at 17:09, mp <mjjp00@gmail.com
>>>>>                 <mailto:mjjp00@gmail.com>> wrote:
>>>>>
>>>>>                 Hello Denis,
>>>>>
>>>>>                 Thanks a lot for your reply!
>>>>>                 Concerning point 2: does it mean that
>>>>>                 "peerClassLoading" simply does not work in 1.5?
>>>>>                 It used to work (partially) in 1.4 (details
>>>>>                 described earlier in the message thread).
>>>>>
>>>>>                 Cheers,
>>>>>                 -Mateusz
>>>>>
>>>>>
>>>>>
>>>>>                 On Thu, Jan 7, 2016 at 1:38 PM, Denis Magda
>>>>>                 <dmagda@gridgain.com <mailto:dmagda@gridgain.com>>
>>>>>                 wrote:
>>>>>
>>>>>                     Hi Mateusz,
>>>>>
>>>>>                     1. It seems that distributed cache is still
>>>>>                     *not* available in
>>>>>                     PRIVATE/ISOLATED modes. Is this correct?
>>>>>
>>>>>                     Right, it hasn't been fixed yet. I've just
>>>>>                     followed up on the related discussion on the
>>>>>                     dev list. Please follow it to see the most
>>>>>                     up-to-date information:
>>>>>                     http://apache-ignite-developers.2346864.n4.nabble.com/Fwd-Distributed-queue-problem-with-peerClassLoading-enabled-tp4521p6440.html
>>>>>
>>>>>                     2. When I run my simple test code in the
>>>>>                     default SHARED mode (the same as
>>>>>                     specified in
>>>>>                     https://issues.apache.org/jira/browse/IGNITE-1823
>>>>>                     jira issue),
>>>>>                     I still get an error. However the cause
>>>>>                     exception seems to be different.
>>>>>                     Please see attached server log.
>>>>>
>>>>>                     The reason is that there is an attempt to
>>>>>                     deserialize a binary object stored on a server
>>>>>                     node, and the server node doesn't have the
>>>>>                     object's class definition on its class path.
>>>>>                     I've opened a ticket
>>>>>                     https://issues.apache.org/jira/browse/IGNITE-2339
>>>>>
>>>>>                     As a workaround you can put the class
>>>>>                     definition on the server's class path and the
>>>>>                     problem will disappear.
>>>>>
>>>>>                     Regards,
>>>>>                     Denis
>>>>>
>>>>>                     On 1/7/2016 1:30 PM, mjjp wrote:
>>>>>
>>>>>                         Hello,
>>>>>
>>>>>                         I have just downloaded 1.5.0-final to
>>>>>                         check if my problem has been resolved.
>>>>>                         Either I'm doing something wrong, or
>>>>>                         version 1.5 has the same behavior in
>>>>>                         this context:
>>>>>
>>>>>                         1. It seems that distributed cache is
>>>>>                         still *not* available in
>>>>>                         PRIVATE/ISOLATED modes. Is this correct?
>>>>>
>>>>>                         2. When I run my simple test code in the
>>>>>                         default SHARED mode (the same as
>>>>>                         specified in
>>>>>                         https://issues.apache.org/jira/browse/IGNITE-1823
>>>>>                         jira issue),
>>>>>                         I still get an error. However the cause
>>>>>                         exception seems to be different.
>>>>>                         Please see attached server log.
>>>>>
>>>>>                         Would you be able to check the attached
>>>>>                         log to verify if this is an expected
>>>>>                         behavior in 1.5?
>>>>>
>>>>>                         Cheers,
>>>>>                         -Mateusz
>>>>>
>>>>>                         ignite-fd14d572.log
>>>>>                         <http://apache-ignite-users.70518.x6.nabble.com/file/n2416/ignite-fd14d572.log>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                         --
>>>>>                         View this message in context:
>>>>>                         http://apache-ignite-users.70518.x6.nabble.com/Distributed-queue-problem-with-peerClassLoading-enabled-tp1762p2416.html
>>>>>                         Sent from the Apache Ignite Users mailing
>>>>>                         list archive at Nabble.com
>>>>>                         <http://nabble.com>.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>

