hadoop-hdfs-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: Is Hadoop's ToolRunner thread-safe?
Date Fri, 21 Mar 2014 08:26:01 GMT
JIRA, test, patch and review? I am sure the community would welcome it. And
if you don't, well, it is unlikely to appear in Hadoop trunk anytime soon.

Bertrand


On Fri, Mar 21, 2014 at 12:49 AM, Something Something <
mailinglists19@gmail.com> wrote:

> Confirmed that ToolRunner is NOT thread-safe:
>
> *Original code (which runs into problems):*
>
>   public static int run(Configuration conf, Tool tool, String[] args)
>     throws Exception{
>     if(conf == null) {
>       conf = new Configuration();
>     }
>     GenericOptionsParser parser = new GenericOptionsParser(conf, args);
>     //set the configuration back, so that Tool can configure itself
>     tool.setConf(conf);
>
>     //get the args w/o generic hadoop args
>     String[] toolArgs = parser.getRemainingArgs();
>     return tool.run(toolArgs);
>   }
>
>
> *New code (which works):*
>
>     public static int run(Configuration conf, Tool tool, String[] args)
>             throws Exception{
>         if(conf == null) {
>             conf = new Configuration();
>         }
>         GenericOptionsParser parser = getParser(conf, args);
>
>         tool.setConf(conf);
>
>         //get the args w/o generic hadoop args
>         String[] toolArgs = parser.getRemainingArgs();
>         return tool.run(toolArgs);
>     }
>
>     // synchronized: only one thread at a time constructs a parser
>     private static synchronized GenericOptionsParser getParser(
>             Configuration conf, String[] args) throws Exception {
>         return new GenericOptionsParser(conf, args);
>     }
>
> On Wed, Mar 19, 2014 at 10:15 AM, Something Something <
> mailinglists19@gmail.com> wrote:
>
>> I would like to trigger a few Hadoop jobs simultaneously.  I've created
>> a pool of threads using Executors.newFixedThreadPool.  The idea is that
>> if the pool size is 2, my code will trigger 2 Hadoop jobs at exactly the
>> same time using 'ToolRunner.run'.  In my testing, I noticed that these 2
>> threads keep stepping on each other.
>>
>> When I looked under the hood, I noticed that ToolRunner creates a
>> GenericOptionsParser, which in turn calls a static method,
>> 'buildGeneralOptions'.  This method uses 'OptionBuilder.withArgName',
>> which stores its value in a static field called 'argName'.  This doesn't
>> look thread-safe to me, and I believe it is the root cause of the issues
>> I am running into.
>>
>> Any thoughts?
>>
>
>
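The pattern discussed in this thread can be illustrated without Hadoop on the classpath. What follows is a minimal sketch, not the ToolRunner or Commons CLI source: `StaticBuilder` is a hypothetical stand-in for a builder that keeps its pending state in static fields, and `buildOption` and `countCorrupted` are made-up names. It shows the same remedy as the synchronized `getParser` above, namely running the whole configure-then-create sequence under one class-level lock while the callers come from a fixed thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedBuilderDemo {

    // Hypothetical stand-in for a builder whose pending state lives in
    // static fields, so every thread shares the same argName/description.
    static final class StaticBuilder {
        private static String argName;
        private static String description;

        static void withArgName(String name) { argName = name; }
        static void withDescription(String text) { description = text; }

        // Reads the shared state, then resets it for the next caller.
        static String create() {
            String built = argName + ":" + description;
            argName = null;
            description = null;
            return built;
        }
    }

    // Same idea as the synchronized getParser(...) in the thread: the whole
    // configure-then-create sequence runs under one class-level lock.
    static synchronized String buildOption(String name) {
        StaticBuilder.withArgName(name);
        Thread.yield(); // widen the window in which another thread could interfere
        StaticBuilder.withDescription(name);
        return StaticBuilder.create();
    }

    // Runs `tasks` builds on a fixed pool and counts results whose two
    // halves do not match the name that was submitted.
    static int countCorrupted(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final String name = "job" + i;
                Callable<String> task = () -> buildOption(name);
                futures.add(pool.submit(task));
            }
            int corrupted = 0;
            for (int i = 0; i < tasks; i++) {
                String expected = "job" + i + ":job" + i;
                if (!expected.equals(futures.get(i).get())) {
                    corrupted++;
                }
            }
            return corrupted;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("corrupted results: " + countCorrupted(4, 1000));
    }
}
```

Because every access to the shared fields happens inside `buildOption`, the count stays at zero. Removing the `synchronized` keyword lets threads overwrite each other's pending values, which is the kind of interference reported above. As in the patched `run` method, the lock covers only the construction step, so the jobs themselves would still run concurrently.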
