camel-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Claus Ibsen (JIRA)" <>
Subject [jira] [Commented] (CAMEL-11750) Camel route with multicast (parallel) generate huge CPU load
Date Fri, 08 Sep 2017 08:22:01 GMT


Claus Ibsen commented on CAMEL-11750:

Thanks for reporting and diving int the JDK code and showing CPU graphs. Yeah it does smell
like the JDK with the no-delay would keep looping in that logic

> Camel route with multicast (parallel) generate huge CPU load
> ------------------------------------------------------------
>                 Key: CAMEL-11750
>                 URL:
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>            Reporter: Nazar Vishka
>            Priority: Critical
> I've created simple [Spring Camel Route application|]
that can be used for issue reproducing. Here we have very simple route:
> {code}
>     public void configure() throws Exception {
>         from("direct:start").routeId("start")
>                 .multicast().parallelProcessing()
>                 .to("direct:very-long-task", "direct:long-task")
>                 .end();
>         from("direct:long-task").routeId("long-task")
>                 .log("Started long-task")
>                 .process(exchange -> Thread.sleep(5000))
>                 .log("Finished long-task")
>                 .end();
>         from("direct:very-long-task").routeId("very-long-task")
>                 .log("Started very-long-task")
>                 .process(exchange -> Thread.sleep(35000))
>                 .log("Finished very-long-task")
>                 .end();
>     }{code}
> From our main route 'start' we are starting in parallel two sub-routes: 'long-task' and
'very-long-task'. They are just doing something for some period of time and do not generate
any load to the system.
> But I found that when one task finished earlier than other one, route start to make a
huge CPU load. Here you can see a CPU usage during executiong of Camel route that was mentioned
earlier (after finishing of 'long-task' usage of CPU uncreased from 0 to 12.5%):
> !!
> Screenshot was made when I was running route on my Windows PC with 4 physical CPU cores
+ 4 HT. On Unix systems we found that after end of 'long-task' it used 100% of one core till
the end of work.
> One more interesting thing that i've found is that the main load on the system was generated
by the thread MulticastProcessor-AggregateTask that was spending a lot of time in the method
java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#awaitNanos that was
called from java.util.concurrent.DelayQueue#poll(long, java.util.concurrent.TimeUnit):
> {code}"Camel (camel-1) thread #2 - MulticastProcessor-AggregateTask" #29 daemon prio=5
os_prio=0 tid=0x00000000215e3000 nid=0x7a0 runnable [0x0000000022eaf000]
>    java.lang.Thread.State: RUNNABLE
> 	at java.lang.Thread.isInterrupted(Native Method)
> 	at java.lang.Thread.interrupted(
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.checkInterruptWhileWaiting(
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(
> 	at java.util.concurrent.DelayQueue.poll(
> 	at org.apache.camel.util.concurrent.SubmitOrderedCompletionService.poll(
> 	at org.apache.camel.processor.MulticastProcessor$AggregateOnTheFlyTask.aggregateOnTheFly(
> 	at org.apache.camel.processor.MulticastProcessor$
> 	at java.util.concurrent.Executors$
> 	at
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at{code}
> In DelayQueue#poll(long, java.util.concurrent.TimeUnit) we have a piece of code:
> {code}E first = q.peek();
> if (first == null) {
>  ...
> } else {
>   long delay = first.getDelay(NANOSECONDS);
>     ...
>   long timeLeft =  available.awaitNanos(delay) {code}
> During debugging I found that E first is object of class [org.apache.camel.util.concurrent.SubmitOrderedCompletionService.SubmitOrderFutureTask|]
and it's very interesting [getDelay(TimeUnit) method|]
always return 1. I thing that is a source of current issue:
> {code} public long getDelay(TimeUnit unit) {
>             // if the answer is 0 then this task is ready to be taken
>             return id - index.get();
>         }{code}
> So when delay is 1, AggregateTask thread is not sleeping at all and generates a huge
CPU load all the time when it is waiting for the end of the last task.
> I think that [getDelay(TimeUnit) method|]
should be somehow re-implemented.

This message was sent by Atlassian JIRA

View raw message