nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Trivial Update of "RunNutchInEclipse" by LewisJohnMcgibbney
Date Wed, 03 Aug 2011 12:07:54 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "RunNutchInEclipse" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/RunNutchInEclipse?action=diff&rev1=20&rev2=21

+ ##Original credits: RenaudRichardet
+ 
  = RunNutchInEclipse =
  This page acts as a resource for working with Nutch from within the Eclipse IDE. It is intended
to provide a comprehensive beginning resource for the configuration, building, crawling and
debugging of Nutch 1.3 in the above context.
  
@@ -93, +95 @@

  Yes, Nutch and Eclipse can be a difficult companionship sometimes ;-)
  
  === eclipse: Cannot create project content in workspace ===
- The nutch source code must be out of the workspace folder. My first attemp was download
the code with eclipse (svn) under my workspace. When I try to create the project using existing
code, eclipse don't let me do it from source code into the workspace. I use the source code
out of my workspace and it work fine.
+ The Nutch source code must be located outside the workspace folder. Alternatively, you can
check out the code with Eclipse (svn) directly into your workspace rather than creating the project
from existing code; Eclipse sometimes refuses to create a project from source code that already
sits inside the workspace.
  
- === plugin dir not found ===
+ === plugin directory not found ===
- Make sure you set your plugin.folders property correct, instead of using a relative path
you can use a absoluth one as well in nutch-defaults.xml or may be better in nutch-site.xml
+ Make sure you set your plugin.folders property correctly. Instead of a relative path
you can use an absolute one, either in nutch-default.xml or, even better, in nutch-site.xml.
Ideally, every effort should be made to keep nutch-default.xml completely intact.
  
  {{{
  <property>
    <name>plugin.folders</name>
-   <value>/home/....../nutch-0.8/src/plugin</value>
+   <value>/home/....../nutch-1.3/src/plugin</value>
  }}}
+ 
  === No plugins loaded during unit tests in Eclipse ===
  During unit testing, Eclipse ignored conf/nutch-site.xml in favor of src/test/nutch-site.xml,
so you might need to add the plugin directory configuration to that file as well.
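
As a minimal sketch, a complete nutch-site.xml carrying only the plugin.folders override might look like the following. The elided path is kept exactly as in the fragment above and must be replaced with the absolute path of your own checkout. The enclosing <configuration> element and the closing </property> tag are the parts the fragment above leaves out. Given the unit-test behaviour just described, the same property would presumably be needed in both conf/nutch-site.xml and src/test/nutch-site.xml.

{{{
<?xml version="1.0"?>
<configuration>
  <!-- Overrides the relative default from nutch-default.xml, which stays untouched. -->
  <property>
    <name>plugin.folders</name>
    <!-- Elided path as in the original; substitute your own absolute path. -->
    <value>/home/....../nutch-1.3/src/plugin</value>
  </property>
</configuration>
}}}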
- 
- === Unit tests work in eclipse but fail when running ant in the command line ===
- Suppose your unit tests work perfectly in eclipse, but each and everyone fail when running
'''ant test''' in the command line - including the ones you haven't modified.   Check if you
defined the '''plugin.folders''' property in hadoop-site.xml. In that case, try removing it
from that file and adding it directly to nutch-site.xml
- 
- Run '''ant test''' again.  That should have solved the problem.
- 
- If that didn't solve the problem, are you testing a plugin?  If so, did you add the plugin
to the list of packages in plugin\build.xml, on the test target?
  
  === classNotFound ===
   * open the class itself, right-click
   * refresh the build dir
  
- === missing org.farng and com.etranslate ===
- You may have problems with some imports in parse-mp3 and parse-rtf plugins. Because of incompatibility
with apache licence they were left from sources. You can find it here:
+ === debugging Hadoop classes ===
+ It often makes sense to have the Hadoop classes available during debugging as well. This
should really be second nature, as Nutch relies heavily on the underlying Hadoop infrastructure.
You can therefore check out (svn) the Hadoop sources into your Eclipse IDE and debug both
projects together. You can:
+   * check out the Hadoop version that should be used with Nutch 1.3
+   * configure a Hadoop project similar to the Nutch project within your Eclipse IDE
+   * add the Hadoop project as a dependent project of the Nutch project
+   * you can now also set breakpoints within Hadoop classes such as InputFormat implementations,
etc.
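
The project dependency is normally added through the Nutch project's Properties > Java Build Path > Projects tab. For reference, the entry Eclipse then writes into the Nutch project's .classpath file looks roughly like the sketch below. The project name "hadoop" is a hypothetical placeholder for whatever name you gave the Hadoop project in your workspace.

{{{
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
  <!-- Existing Nutch source and library entries stay as they are. -->
  <!-- Reference to the Hadoop project; "/hadoop" is an assumed project name. -->
  <classpathentry combineaccessrules="false" kind="src" path="/hadoop"/>
</classpath>
}}}

With this entry in place, Eclipse resolves Hadoop classes from the project's sources, so breakpoints set there are hit when Nutch runs in the debugger.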
  
- http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
- 
- http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/
- 
- You need to copy jar files into plugin "lib" path and refresh the project.
- 
- === debugging hadoop classes ===
-  . Sometime it makes sense to also have the hadoop classes available during debugging. So,
you can check out the Hadoop sources on your machine and add the sources to the  hadoop-xxx.jar.
Alternatively, you can:
-   * Remove the hadoopXXX.jar from your classpath libraries
-   * Checkout the hadoop brunch that is used within nutch
-   * configure a hadoop project similar to the nutch project within your eclipse
-   * add the hadoop project as a dependent project of nutch project
-   * you can now also set break points within hadoop classes lik inputformat implementations
etc.
- 
- Original credits: RenaudRichardet
- 
