nutch-user mailing list archives

From Chip Calhoun <ccalh...@aip.org>
Subject Trouble running solrindexer from Nutch 1.4
Date Wed, 07 Dec 2011 22:17:08 GMT
This is probably just down to my not waiting for a 1.4 tutorial, but here goes. I've always
used the following two commands to run my crawl and then index to Solr:
# bin/nutch crawl urls -dir crawl -depth 1 -topN 500000
# bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*

In 1.3 that works great. But in 1.4, when I run solrindex, I get this:
# bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
SolrIndexer: starting at 2011-12-07 17:09:58
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/crawl_fetch
Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/crawl_parse
Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/parse_data
Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/parse_text

Sure enough, those directories don't exist. But they didn't exist in 1.3 either. What am I
missing?
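
All four missing paths sit under crawl/linkdb, which suggests the indexer is reading that argument as if it were a segment directory. A minimal sketch of a check, assuming a POSIX shell (for example Cygwin) run from runtime/local; the helper name check_segment_dirs is hypothetical, and the four subdirectory names are taken from the error above:

```shell
#!/bin/sh
# Hypothetical helper: report which of the subdirectories named in the
# error message exist under a given path.
check_segment_dirs() {
  dir="$1"
  for sub in crawl_fetch crawl_parse parse_data parse_text; do
    if [ -d "$dir/$sub" ]; then
      echo "present: $dir/$sub"
    else
      echo "missing: $dir/$sub"
    fi
  done
}

# Check the linkdb argument, then each segment argument.
check_segment_dirs crawl/linkdb
for seg in crawl/segments/*; do
  check_segment_dirs "$seg"
done
```

If the segments report these subdirectories as present and only crawl/linkdb reports them as missing, the positional arguments are probably being interpreted differently than in 1.3. Running bin/nutch solrindex with no arguments should print the argument order the installed version expects.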

Thanks,
Chip
