hadoop-common-dev mailing list archives

From Enis Soztutar <enis.soz.nu...@gmail.com>
Subject Re: Bug in either ToolBase or ToolRunner or Nutch jobs
Date Wed, 10 Oct 2007 06:07:01 GMT
Hi Dennis,

ToolRunner runs the tools given to it by first modifying the 
configuration and then passing it to the tool. There are two methods: 
ToolRunner#run(Tool, String[]) and ToolRunner#run(Configuration, Tool, 
String[]). The former delegates to the latter using tool.getConf(), and 
the latter instantiates a configuration if null is passed, and calls 
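
The delegation described above can be sketched roughly as follows. This is 
only a simplified model, not the actual Hadoop source: Conf, Tool, EchoTool 
and the key "example.setting" are hypothetical stand-ins, generic option 
parsing is left out, and the step of handing the configuration to the tool 
before running it is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class ToolRunnerSketch {
    /** Stand-in for a Hadoop Configuration: just a string map. */
    static class Conf {
        final Map<String, String> values = new HashMap<>();
    }

    /** Stand-in for the Tool interface (checked exceptions omitted). */
    interface Tool {
        Conf getConf();
        void setConf(Conf conf);
        int run(String[] args);
    }

    /** Two-argument form: delegates using whatever conf the tool already holds. */
    static int run(Tool tool, String[] args) {
        return run(tool.getConf(), tool, args);
    }

    /** Three-argument form: creates a configuration if null was passed. */
    static int run(Conf conf, Tool tool, String[] args) {
        if (conf == null) {
            conf = new Conf();
        }
        // Assumption: the configuration is handed to the tool before it runs.
        tool.setConf(conf);
        return tool.run(args);
    }

    /** Hypothetical tool: returns 0 if it can see the caller's setting, else 1. */
    static class EchoTool implements Tool {
        private Conf conf;
        public Conf getConf() { return conf; }
        public void setConf(Conf conf) { this.conf = conf; }
        public int run(String[] args) {
            return conf.values.containsKey("example.setting") ? 0 : 1;
        }
    }

    public static void main(String[] argv) {
        Conf custom = new Conf();
        custom.values.put("example.setting", "on");
        // Explicit configuration: the tool sees the setting.
        System.out.println(run(custom, new EchoTool(), argv));
        // No configuration on the tool: a fresh, empty one is created.
        System.out.println(run(new EchoTool(), argv));
    }
}
```

In this model, a tool that was never given a configuration ends up with a 
fresh one, so anything the caller prepared separately is not visible to it.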

In most cases, Hadoop's internal classes prefer the simpler 
ToolRunner#run(Tool, String[]), but I bet Nutch would prefer to use 
the form:

public static void main(String argv[]) throws Exception {
    int res = ToolRunner.run(NutchConfiguration.create(), new ToolImp(), 

Hope this solves the problem.

Dennis Kubes wrote:
> I don't know if this bug is in the way ToolRunner works, ToolBase 
> works, or the way Nutch implements some of its jobs, but here is the 
> scenario.
> Many Nutch jobs (Injector for instance) use ToolBase and call the 
> doMain(Configuration conf, String[] args) method to run.  ToolBase now 
> calls ToolRunner as return ToolRunner.run(this, args);  The problem is 
> that the configuration object passed in to ToolBase is not set as 
> the conf object in ToolBase and so is essentially ignored by 
> ToolRunner.  So any Nutch resources are ignored.
> The solution to this is pretty simple:
>   public final int doMain(Configuration conf, String[] args) throws Exception {
>     setConf(conf);
>     return ToolRunner.run(this, args);
>   }
> But since we are moving away from ToolBase I didn't know if there is a 
> better solution for this, for example should the current Nutch jobs be 
> moved over to ToolRunner instead or should we make this simple change 
> now for compatibility as we move the jobs to ToolRunner?  Any guidance 
> is appreciated.
> Dennis Kubes
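
For illustration, the scenario in the quoted report can be modelled with a 
small self-contained sketch. Everything here is a hypothetical stand-in 
(Conf, ToolBaseLike, InjectorLike, runViaRunner), not the real ToolBase or 
Injector code; it only assumes that the runner falls back to a default 
configuration when the tool has none.

```java
public class DoMainSketch {
    /** Stand-in configuration that only remembers where it came from. */
    static class Conf {
        final String source;
        Conf(String source) { this.source = source; }
    }

    /** Simplified stand-in for ToolBase. */
    abstract static class ToolBaseLike {
        private Conf conf;
        Conf getConf() { return conf; }
        void setConf(Conf conf) { this.conf = conf; }
        abstract int run(String[] args);

        /** Models ToolRunner.run(this, args): only the tool's own conf is used. */
        private int runViaRunner(String[] args) {
            if (getConf() == null) {
                setConf(new Conf("default"));
            }
            return run(args);
        }

        /** Behaviour described in the report: the conf argument is never stored. */
        final int doMainIgnoringConf(Conf conf, String[] args) {
            return runViaRunner(args);
        }

        /** Proposed fix: store the conf on the tool before delegating. */
        final int doMainWithFix(Conf conf, String[] args) {
            setConf(conf);
            return runViaRunner(args);
        }
    }

    /** Hypothetical job that records which configuration it actually saw. */
    static class InjectorLike extends ToolBaseLike {
        String seenSource;
        int run(String[] args) {
            seenSource = getConf().source;
            return 0;
        }
    }

    public static void main(String[] argv) {
        InjectorLike broken = new InjectorLike();
        broken.doMainIgnoringConf(new Conf("nutch"), argv);
        System.out.println(broken.seenSource); // prints "default"

        InjectorLike fixed = new InjectorLike();
        fixed.doMainWithFix(new Conf("nutch"), argv);
        System.out.println(fixed.seenSource); // prints "nutch"
    }
}
```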
