nutch-dev mailing list archives

From Renaud Richardet <...@oslutions.com>
Subject Re: api.RegexURLFilterBase - Configuration Resources
Date Tue, 06 Feb 2007 22:21:06 GMT
Tobias Zahn wrote:
> Hello!
> I have written a new plugin extending the IndexingFilter and using the
> RegexURLFilterBase class.
> In the log there is this message:
>
> FATAL api.RegexURLFilterBase - Can't find resource: null
>   
In your new class CustomIndexingFilter, create a field of type
Configuration named conf, and implement setConf and getConf like this:

  private Configuration conf;

  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  public Configuration getConf() {
    return this.conf;
  }

Then pass the conf object to the RegexURLFilterBase instance before
calling it:

RegexURLFilterBase r = new RegexURLFilter();
r.setConf(conf);
r.filter("sometext");

This should do the trick.
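
To make the idea concrete without needing Nutch on the classpath, here is
a minimal, self-contained sketch of the same pattern. Everything in it is
a stand-in invented for illustration: SimpleConf, SimpleRegexFilter,
CustomFilter and the "rules.file" key are hypothetical and are not the
Hadoop or Nutch classes. It assumes, as in the advice above, that the
framework hands your plugin its configuration through setConf, and it
only shows why the helper fails with "Can't find resource: null" until
that configuration is passed along.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: none of these types are the real Hadoop/Nutch classes.
public class ConfPassingSketch {

    /** Hypothetical stand-in for a Configuration object. */
    static class SimpleConf {
        private final Map<String, String> values = new HashMap<>();

        void set(String key, String value) { values.put(key, value); }

        String get(String key) { return values.get(key); }
    }

    /** Hypothetical helper that needs a configuration before it can filter. */
    static class SimpleRegexFilter {
        private SimpleConf conf;

        void setConf(SimpleConf conf) { this.conf = conf; }

        String filter(String url) {
            // With no configuration there is no rule file name to look up.
            String ruleFile = (conf == null) ? null : conf.get("rules.file");
            if (ruleFile == null) {
                throw new IllegalStateException("Can't find resource: " + ruleFile);
            }
            return url; // a real filter would apply its rules here
        }
    }

    /** Plays the role of CustomIndexingFilter: stores conf and forwards it. */
    static class CustomFilter {
        private SimpleConf conf;

        public void setConf(SimpleConf conf) { this.conf = conf; }

        public SimpleConf getConf() { return this.conf; }

        String check(String url) {
            SimpleRegexFilter r = new SimpleRegexFilter();
            r.setConf(conf); // forward the configuration before calling filter
            return r.filter(url);
        }
    }

    public static void main(String[] args) {
        SimpleConf conf = new SimpleConf();
        conf.set("rules.file", "regex-urlfilter.txt"); // hypothetical key and value

        CustomFilter f = new CustomFilter();
        f.setConf(conf); // in the sketch we play the framework's part ourselves
        System.out.println(f.check("http://example.com/"));
    }
}
```

If check is called before setConf, the helper throws with the message
"Can't find resource: null", which mirrors the log line in the question.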

I assume you have set up the build configuration of your plugin
correctly; that was tricky for me ;-)

build.xml

 <!-- Build compilation dependencies -->
 <target name="deps-jar">
     ......
     <ant target="jar" inheritall="false" dir="../urlfilter-regex"/>
     <ant target="jar" inheritall="false" dir="../lib-regex-filter"/>
 </target>

 <!-- Add compilation dependencies to classpath -->
 <path id="plugin.deps">
   <fileset dir="${nutch.root}/build">
     .......
     <include name="**/urlfilter-regex/*.jar" />
     <include name="**/lib-regex-filter/*.jar" />
   </fileset>
 </path>

and plugin.xml
 <requires>
     <import plugin="nutch-extensionpoints"/>
     ......
     <import plugin="urlfilter-regex"/>
     <import plugin="lib-regex-filter"/>
 </requires>

HTH,
Renaud



> I don't know how to handle that Configuration-Objects (setConf() etc.)
> What should I do to avoid that error? Where does the
> Configuration-Object come from?
>
> TIA
> Tobias Zahn
>
>   


-- 
Renaud Richardet                                      +1 617 230 9112
my email is my first name at apache.org      http://www.oslutions.com

