hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Parks" <davidpark...@yahoo.com>
Subject How to submit Tool jobs programatically in parallel?
Date Fri, 14 Dec 2012 04:39:05 GMT
I'm submitting unrelated jobs programmatically (using AWS EMR) so they run
in parallel.

I'd like to run an s3distcp job in parallel as well, but the interface to
that job is a Tool, e.g. ToolRunner.run(...).

ToolRunner blocks until the job completes though, so presumably I'd need to
create a thread pool to run these jobs in parallel.

But creating multiple threads to submit concurrent jobs via ToolRunner,
blocking on the jobs completion, just feels improper. Is there an

View raw message