camel-issues mailing list archives

From "Alex Dettinger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CAMEL-10272) Aggregation is broken due to race condition in ParallelAggregateTask.doAggregateInternal()
Date Thu, 24 Nov 2016 18:33:59 GMT

    [ https://issues.apache.org/jira/browse/CAMEL-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693983#comment-15693983
] 

Alex Dettinger commented on CAMEL-10272:
----------------------------------------

I don't think that we are facing such a simple race condition here. In the method _doAggregateInternal(...)_,
_result.get(...)_ and _result.set(...)_ are not run concurrently.
A single thread is used at aggregation time, in the provided example at least.

However, _oldExchange_ can be null more than once when the user's custom aggregation strategy
throws a runtime exception:
{code}
public static class Router {
    public String[] routeTo() {
        return new String[] {"log:MSGROUTER_0?level=TRACE", "log:MSGROUTER_1?level=TRACE"};
    }
}

@Override
protected RouteBuilder createRouteBuilder() {
    return new RouteBuilder() {
        @Override
        public void configure() throws Exception {
            from("direct:start").
            recipientList().
            method(new Router()).
            aggregationStrategy(new AggregationStrategy() {
                public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
                    if (oldExchange == null) {
                        System.out.println("oldExchange is null " + newExchange.getExchangeId()
                            + " thread: " + Thread.currentThread());
                    }
                    System.out.println(3 / 0); // throws java.lang.ArithmeticException
                    return null;
                }
            }).
            parallelProcessing().
            end();
        }
    };
}
{code}
outputs:
{noformat}
oldExchange is null ID-alex-42036-1479994398470-0-3 thread: Thread[Camel (camel-1) thread #2 - RecipientList-AggregateTask,5,main]
oldExchange is null ID-alex-42036-1479994398470-0-4 thread: Thread[Camel (camel-1) thread #2 - RecipientList-AggregateTask,5,main]
{noformat}
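The same effect can be reproduced without Camel. The following is only a minimal sketch, not Camel's code: {{NullTwiceDemo}} and {{aggregateStep}} are hypothetical names, an {{AtomicReference<String>}} stands in for the shared result, a {{BinaryOperator<String>}} stands in for the aggregation strategy, and the {{catch}} in the loop is an assumed stand-in for whatever hides the exception in the aggregate task. It mirrors the get / aggregate / set shape of _doAggregateInternal(...)_ on a single thread:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BinaryOperator;

public class NullTwiceDemo {
    // Counts how often the strategy was handed a null "old" value.
    static int nullCount = 0;

    // Hypothetical stand-in with the same get / aggregate / set shape as doAggregateInternal(...).
    static void aggregateStep(BinaryOperator<String> strategy,
                              AtomicReference<String> result,
                              String incoming) {
        String old = result.get();
        // If the strategy throws, set(...) is never reached and result keeps its previous value (null).
        result.set(strategy.apply(old, incoming));
    }

    public static void main(String[] args) {
        AtomicReference<String> result = new AtomicReference<>();
        BinaryOperator<String> failing = (old, incoming) -> {
            if (old == null) {
                nullCount++;
            }
            throw new ArithmeticException("/ by zero");
        };

        // Two replies aggregated one after the other on the same thread.
        for (String reply : new String[] {"reply-0", "reply-1"}) {
            try {
                aggregateStep(failing, result, reply);
            } catch (RuntimeException e) {
                // Assumption: the failure is swallowed here, so nothing reaches the caller.
            }
        }
        System.out.println("old value was null " + nullCount + " times"); // prints "old value was null 2 times"
    }
}
```

No second thread is involved: the second call sees null only because the first call never stored a result.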
That said, I see the following drawbacks with the current implementation:
* The AggregationStrategy class javadoc lists a single case where oldExchange could be null,
whereas two exist.
* This case could be difficult to debug from a Camel user's perspective, because Camel effectively
hides the runtime exception thrown by the custom aggregation strategy.

From there, I see two paths for handling a runtime exception thrown from a custom aggregation
strategy:
* Produce an ERROR log. We then need to correct the javadoc accordingly.
* Propagate the exception to the default error handler. Note that this is what happens when parallelProcessing
is false.
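
To make the difference between the two paths concrete, here is a rough Camel-free sketch. All names ({{FailureHandlingSketch}}, {{aggregateAndLog}}, {{aggregateAndPropagate}}) are hypothetical, {{java.util.logging}} at SEVERE level stands in for an ERROR log, and the caller's {{catch}} block stands in for the default error handler. It is not a proposal for the actual patch:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BinaryOperator;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FailureHandlingSketch {
    private static final Logger LOG = Logger.getLogger(FailureHandlingSketch.class.getName());

    // Path 1: report the failure in the log and carry on.
    // The result is left untouched, so a later call may still see a null old value.
    static <T> boolean aggregateAndLog(BinaryOperator<T> strategy,
                                       AtomicReference<T> result,
                                       T incoming) {
        try {
            result.set(strategy.apply(result.get(), incoming));
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Custom aggregation strategy failed", e);
            return false;
        }
    }

    // Path 2: do not catch anything, so the caller sees the exception.
    static <T> void aggregateAndPropagate(BinaryOperator<T> strategy,
                                          AtomicReference<T> result,
                                          T incoming) {
        result.set(strategy.apply(result.get(), incoming));
    }

    public static void main(String[] args) {
        BinaryOperator<String> failing = (old, incoming) -> {
            throw new ArithmeticException("/ by zero");
        };
        AtomicReference<String> result = new AtomicReference<>();

        boolean ok = aggregateAndLog(failing, result, "reply-0");
        System.out.println("path 1 succeeded: " + ok + ", result: " + result.get());

        try {
            aggregateAndPropagate(failing, result, "reply-1");
        } catch (ArithmeticException e) {
            // Stand-in for the error handler receiving the failure.
            System.out.println("path 2 surfaced: " + e.getMessage());
        }
    }
}
```

With the first path the javadoc would still need to mention the second null case; with the second path the failure becomes visible to the caller instead.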

Any thoughts on which path is the right one?

> Aggregation is broken due to race condition in ParallelAggregateTask.doAggregateInternal()
> ------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-10272
>                 URL: https://issues.apache.org/jira/browse/CAMEL-10272
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.16.3, 2.17.3
>         Environment: MacOS 10.11.6, JRE 1.7.0_79
>            Reporter: Peter Keller
>
> Unfortunately, I am not able to provide a (simple) unit test that demonstrates the problem.
> Furthermore, our (complex) unit tests are not deterministic because of the root cause of the problem.
> However, I tried to analyze the Camel Java code to work out the problem. Please find
> my findings below.
> h3. Problem
> The {{oldExchange}} is {{null}} more than once in the aggregator if a recipient list
> is processed in parallel.
> h3. Camel route
> In my Camel route, a recipient list is processed in parallel:
> {code}
>  from("direct:start")
>     .to("direct:pre")
>     .recipientList().method(new MyRecipientListBuilder())
>         .stopOnException()
>         .aggregationStrategy(new MyAggregationStrategy())
>         .parallelProcessing()
>     .end()
>     .bean(new MyPostProcessor());
> {code}
> Snippet of {{MyAggregationStrategy}}:
> {code}
> @Override
> @SuppressWarnings("unchecked")
> public Exchange aggregate(final Exchange oldExchange, final Exchange newExchange) {
>     if (oldExchange == null) {
>         // this is the case more than once which is not expected!
>     }
>     // ...
> {code}
> {{oldExchange}} is null more than once, which is not expected and which contradicts the
> contract with Camel.
> h3. Analysis
> During the processing, Camel invokes {{MulticastProcessor.process()}}. Here the result
> object {{AtomicExchange}} is created, which is shared during the whole processing.
> If the processing should be done in parallel (as is the case for our route), then {{MulticastProcessor.doProcessParallel()}}
> is invoked. Here one instance of {{AggregateOnTheFlyTask}} is initialized and {{aggregateOnTheFly()}}
> is invoked -*asynchronously* via {{run()}}  for *every* target in the recipient list-. via
> {{aggregateExecutorService.submit}} ({{aggregationTaskSubmitted}} guarantees that this is
> only done once).
> In {{aggregateOnTheFly()}}, a new instance of {{ParallelAggregateTask}} is generated,
> and if aggregation is not done in parallel (as is the case in our route), {{ParallelAggregateTask.run()}},
> {{ParallelAggregateTask.doAggregate()}} (this method is synchronized), and
> {{ParallelAggregateTask.doAggregateInternal()}} are invoked synchronously:
> {code}
> protected void doAggregateInternal(AggregationStrategy strategy, AtomicExchange result, Exchange exchange) {
>     if (strategy != null) {
>         // prepare the exchanges for aggregation
>         Exchange oldExchange = result.get();
>         ExchangeHelper.prepareAggregation(oldExchange, exchange);
>         result.set(strategy.aggregate(oldExchange, exchange));
>     }
> } 
> {code}
> However, in {{ParallelAggregateTask.doAggregateInternal()}} a race condition may occur,
> as {{result}} is shared -by every instance of {{AggregateOnTheFlyTask}}- such that
> {{oldExchange = result.get()}} may be {{null}} more than once!
> Note: As a new instance of {{ParallelAggregateTask}} is created for every target in the recipient list,
> the {{synchronized}} method {{ParallelAggregateTask.doAggregate()}} does not prevent
> the race condition!
> Does this sound reasonable?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
