nutch-user mailing list archives

From Andy Xue <andyxuey...@gmail.com>
Subject Re: Run Nutch Crawl in Eclipse
Date Tue, 10 Apr 2012 12:03:05 GMT
Ferdy:
Thanks for the heads up, and sorry for spamming the dev channel.

Lewis:
Thanks for the reply.
However, as far as I know, I don't have to set solrUrl unless I want to
index using Solr.

The thing is: using the same configuration, I can run the crawl manually
with the script at "$NUTCH_HOME/runtime/local/bin/nutch", but it fails
when I try to run the Crawl class inside Eclipse. Do you reckon that the
exception is related to solrUrl?
==============================================================================
solrUrl is not set, indexing will be skipped...
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
==============================================================================
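
In case it helps frame the question, here is how I read that output, as a
small standalone Java sketch. It is not the Nutch source: the class
CrawlArgsSketch, its Options holder, the "-solr" flag spelling and the step
names are all assumptions made up for illustration. The point is only that
a missing Solr URL produces a notice and drops the indexing step, while
injection still runs first, which is where the trace above fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Nutch.
public class CrawlArgsSketch {

    /** Parsed command-line options (assumed shape, not a Nutch class). */
    public static final class Options {
        public String urlDir;
        public String solrUrl; // stays null when no Solr URL is passed
        public final List<String> notices = new ArrayList<>();
    }

    /** Parses "<urlDir> [-flag value]..." and records a notice if Solr is unset. */
    public static Options parse(String[] args) {
        Options opts = new Options();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("-")) {
                // Every flag in this sketch is assumed to take exactly one value.
                String value = (i + 1 < args.length) ? args[++i] : null;
                if ("-solr".equals(arg)) { // flag spelling is an assumption
                    opts.solrUrl = value;
                }
            } else if (opts.urlDir == null) {
                opts.urlDir = arg;
            }
        }
        if (opts.solrUrl == null) {
            // A notice, not an error: the crawl itself still goes ahead.
            opts.notices.add("solrUrl is not set, indexing will be skipped...");
        }
        return opts;
    }

    /** Lists the steps that would run; indexing is included only with a Solr URL. */
    public static List<String> plannedSteps(Options opts) {
        List<String> steps = new ArrayList<>();
        steps.add("inject"); // always first, independent of solrUrl
        steps.add("generate");
        steps.add("fetch");
        steps.add("parse");
        steps.add("updatedb");
        if (opts.solrUrl != null) {
            steps.add("index");
        }
        return steps;
    }

    public static void main(String[] args) {
        Options opts = parse(new String[] {"urls", "-depth", "3"});
        for (String notice : opts.notices) {
            System.out.println(notice);
        }
        System.out.println(plannedSteps(opts));
    }
}
```

If that reading is right, the solrUrl line is unrelated to the exception,
since the Injector job fails before any indexing would be attempted.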



Has none of you run into this problem before? Did you all follow the
"RunNutchInEclipse" tutorial and have it just work?

Appreciate it.
Andy

On 10 April 2012 20:05, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:

> Hi Andy,
>
> On Tue, Apr 10, 2012 at 2:37 AM, Andy Xue <andyxueyuan@gmail.com> wrote:
>
> >
> > solrUrl is not set, indexing will be skipped...
> >
>
> If you want Nutch to run end to end, from crawling through to indexing,
> you need to tell it so. In the crawl configuration, set the -solrUrl
> parameter to point to your Solr server.
>
> Lewis
>
