incubator-s4-dev mailing list archives

From Daniel Gómez Ferro (JIRA) <j...@apache.org>
Subject [jira] [Commented] (S4-87) Checkpointing: recovery : avoid rejections upon fetching
Date Tue, 24 Jul 2012 15:51:34 GMT

    [ https://issues.apache.org/jira/browse/S4-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421494#comment-13421494 ]

Daniel Gómez Ferro commented on S4-87:
--------------------------------------

I managed to reproduce it with a fetch task that times out, so subsequent tasks are rejected.

The proposed patch fixes it, +1
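
For readers following along, the reproduction described above can be sketched in isolation. This is a minimal illustration and not the S4 code: it assumes the fetch executor has a single worker thread and a SynchronousQueue (handoff), which the issue description only suspects. The class and method names are hypothetical, and it uses lambdas (Java 8+) for brevity even though the report concerns JDK 1.6.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; not part of S4.
public class HandoffRejectionDemo {

    /** Returns true if a second submit is rejected while the first fetch is still running. */
    static boolean secondSubmitRejected() {
        // Assumption: one worker thread and a handoff queue, so no task can ever wait in the queue.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Stand-in for a fetch that hangs on a slow storage backend.
            Future<?> slowFetch = executor.submit(() -> {
                started.countDown();
                release.await();
                return null;
            });
            started.await();
            try {
                slowFetch.get(50, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The caller gives up, but the worker thread is still busy with the fetch.
            }
            try {
                executor.submit(() -> "serialized state");
                return false;
            } catch (RejectedExecutionException e) {
                // The default AbortPolicy rejects: the only worker is busy and nothing can be queued.
                return true;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("second submit rejected: " + secondSubmitRejected());
    }
}
```

Under these assumptions the rejection is deterministic: once the caller times out without freeing the worker, every later submit fails until the slow fetch completes.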
                
> Checkpointing: recovery : avoid rejections upon fetching
> --------------------------------------------------------
>
>                 Key: S4-87
>                 URL: https://issues.apache.org/jira/browse/S4-87
>             Project: Apache S4
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Matthieu Morel
>            Assignee: Matthieu Morel
>
> Tests pass on Mac OS X with JDK 1.6.0_33 but fail on Ubuntu with the same (Oracle) JDK version.
> Here is the stack trace (I added some logging to see the error):
> {code}
> java.util.concurrent.RejectedExecutionException: null
> 	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768) ~[na:1.6.0_33]
> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767) ~[na:1.6.0_33]
> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658) ~[na:1.6.0_33]
> 	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:92) ~[na:1.6.0_33]
> 	at org.apache.s4.core.ft.SafeKeeper.fetchSerializedState(SafeKeeper.java:239) ~[main/:na]
> 	at org.apache.s4.core.ProcessingElement.recover(ProcessingElement.java:759) [main/:na]
> 	at org.apache.s4.core.ProcessingElement.handleInputEvent(ProcessingElement.java:411) [main/:na]
> 	at org.apache.s4.core.Stream.run(Stream.java:299) [main/:na]
> 	at java.lang.Thread.run(Thread.java:662) [na:1.6.0_33]
>  [words seen stream] ERROR org.apache.s4.core.ProcessingElement - Cannot fetch serialized stated for [org.apache.s4.wordcount.WordCounterPE/doobie
> {code}
> This could be because we use a handoff queue, though that is not clear to me.
> In any case, since there may be parallel recovery requests from different prototypes, a bounded queue, with the option of using multiple threads for the fetch operations, may be more appropriate.
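
The bounded-queue alternative suggested in the description could look roughly like the sketch below. It is not the actual patch: the class name, method names, thread count and queue capacity are all hypothetical, and lambdas (Java 8+) are used for brevity. It only shows that a fixed pool backed by an ArrayBlockingQueue accepts a burst of parallel fetch requests up to threads + capacity before rejecting.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of S4.
public class BoundedFetchPoolDemo {

    /** Builds a fixed-size pool whose pending tasks wait in a bounded queue. */
    static ThreadPoolExecutor newFetchExecutor(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity));
    }

    /** Submits `requests` blocking tasks and returns how many were accepted. */
    static int countAccepted(int threads, int queueCapacity, int requests) {
        ThreadPoolExecutor executor = newFetchExecutor(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    // Each task blocks until released, simulating a slow fetch.
                    executor.submit(() -> {
                        release.await();
                        return null;
                    });
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // All worker threads are busy and the queue is full.
                }
            }
        } finally {
            release.countDown();
            executor.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 2 workers + 4 queue slots: 6 of 10 simultaneous requests are accepted.
        System.out.println("accepted: " + countAccepted(2, 4, 10));
    }
}
```

With a bounded queue, rejections can still occur once both the workers and the queue are full, so callers would still need to handle RejectedExecutionException or install a different rejection policy.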

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira