crunch-dev mailing list archives

From "Chao Shi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-172) Refine synchronization mechanism in CrunchJobControl
Date Sun, 10 Mar 2013 16:05:12 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Shi updated CRUNCH-172:
----------------------------

    Attachment: crunch-172.patch

Remove the background thread from CrunchJobControl and let the monitor thread in
MRExecutor call it instead.
    
Besides this, there are some small changes:
- Use exponential backoff when querying job status. This makes local integration tests run much
faster on Hadoop 2.
- Remove suspend/resume support, because it is currently unused and complicates
synchronization.
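
As a rough illustration of the backoff idea in the first item, the sketch below doubles the wait
between status checks up to a cap, so a short job is noticed quickly while a long job is not
polled constantly. BackoffPoller, its constructor parameters, and the BooleanSupplier standing in
for a job status query are all hypothetical names chosen for this example; they are not taken
from crunch-172.patch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not the class from crunch-172.patch. */
final class BackoffPoller {
  private final long initialMillis;
  private final long maxMillis;

  BackoffPoller(long initialMillis, long maxMillis) {
    if (initialMillis <= 0 || maxMillis < initialMillis) {
      throw new IllegalArgumentException("need 0 < initialMillis <= maxMillis");
    }
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
  }

  /** Delay before the given zero-based attempt: initial * 2^attempt, capped at max. */
  long delayMillis(int attempt) {
    long delay = initialMillis;
    for (int i = 0; i < attempt && delay < maxMillis; i++) {
      // Jump straight to the cap when doubling would exceed it (also avoids overflow).
      delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
    }
    return delay;
  }

  /** Checks isDone repeatedly, sleeping longer after each miss; returns the number of checks. */
  int pollUntilDone(BooleanSupplier isDone) throws InterruptedException {
    int attempt = 0;
    while (!isDone.getAsBoolean()) {
      Thread.sleep(delayMillis(attempt));
      attempt++;
    }
    return attempt + 1;
  }
}
```

With new BackoffPoller(100, 1000), the waits would be 100, 200, 400, 800 and then 1000 ms for
every later attempt. The actual intervals used by the patch are not stated in this message.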

                
> Refine synchronization mechanism in CrunchJobControl
> ----------------------------------------------------
>
>                 Key: CRUNCH-172
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-172
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.0
>            Reporter: Chao Shi
>            Assignee: Josh Wills
>         Attachments: crunch-172.patch
>
>
> Currently CrunchJobControl uses a runnerState field to synchronize its background loop with
> client calls (e.g. stop). This is not sufficient: Jenkins reported a failure after CRUNCH-156
> was checked in.
> MRExecutor does the following in its monitorLoop:
> {code}
>       Thread controlThread = new Thread(control);
>       controlThread.start();
>       while (killSignal.getCount() > 0 && !control.allFinished()) {
>         killSignal.await(1, TimeUnit.SECONDS);
>       }
>       control.stop();
> {code}
> And this is how CrunchJobControl works:
> {code}
>   public void stop() {
>     this.runnerState = ThreadState.STOPPING;
>   }
>   public void run() {
>     this.runnerState = ThreadState.RUNNING;
>     while (true) {
>       ...
>     }
>   }
> {code}
> So stop() can be called before run() has started in the other thread. run() then overwrites
> the STOPPING state with RUNNING, so MRExecutor believes everything has stopped and begins its
> cleanup work while CrunchJobControl continues to submit new jobs. Because the cleanup has
> already been done, the newly submitted jobs fail with FileNotFound.
> I think a solution is to remove the background thread from CrunchJobControl and let MRExecutor
> call it periodically.
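
The proposed design can be sketched roughly as follows: the controller has no thread of its own,
and the monitor loop drives it one step at a time, so stop() can never be overwritten by a run()
that starts late. PolledJobControl, pollOnce, and MonitorLoopSketch are hypothetical names for
this example and do not reflect the actual patch. As a simplification, a job here counts as
finished as soon as it is submitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical controller without a background thread; callers drive it via pollOnce(). */
final class PolledJobControl {
  private final Queue<String> waiting = new ArrayDeque<>();
  private final List<String> submitted = new ArrayList<>();
  private boolean stopped = false;

  synchronized void addJob(String name) {
    waiting.add(name);
  }

  /** One unit of work: submit the next waiting job, unless stop() was already called. */
  synchronized void pollOnce() {
    if (stopped) {
      return; // Once stopped, nothing is ever submitted again.
    }
    String next = waiting.poll();
    if (next != null) {
      submitted.add(next);
    }
  }

  synchronized void stop() {
    stopped = true;
  }

  /** Simplification: a job is treated as finished once it has been submitted. */
  synchronized boolean allFinished() {
    return waiting.isEmpty();
  }

  synchronized List<String> submittedJobs() {
    return new ArrayList<>(submitted);
  }
}

/** Hypothetical monitor loop, modeled on the MRExecutor snippet quoted above. */
final class MonitorLoopSketch {
  static List<String> run(PolledJobControl control, CountDownLatch killSignal)
      throws InterruptedException {
    while (killSignal.getCount() > 0 && !control.allFinished()) {
      control.pollOnce(); // Replaces the separate control thread.
      killSignal.await(10, TimeUnit.MILLISECONDS);
    }
    control.stop();
    return control.submittedJobs();
  }
}
```

Because pollOnce() and stop() are both invoked by the monitor loop, cleanup that follows stop()
cannot overlap with a later submission in this sketch.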

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
