nutch-user mailing list archives

From Chip Calhoun <ccalh...@aip.org>
Subject RE: Trouble running solrindexer from Nutch 1.4
Date Thu, 08 Dec 2011 21:36:05 GMT
Thanks! It's working fine now.
________________________________________
From: Lewis John Mcgibbney [lewis.mcgibbney@gmail.com]
Sent: Thursday, December 08, 2011 8:27 AM
To: user@nutch.apache.org
Subject: Re: Trouble running solrindexer from Nutch 1.4

Thanks Tim.

In addition, Chip, the tutorial has now been updated to include Tim's
comments and to cover the latest release, Nutch 1.4.

Thanks

Lewis

On Wed, Dec 7, 2011 at 10:45 PM, Tim Pease <tim.pease@gmail.com> wrote:

>
> On Dec 7, 2011, at 3:17 PM, Chip Calhoun wrote:
>
> > This is probably just down to my not waiting for a 1.4 tutorial, but
> > here goes. I've always used the following two commands to run my crawl
> > and then index to Solr:
> > # bin/nutch crawl urls -dir crawl -depth 1 -topN 500000
> > # bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
> >
> > In 1.3 that works great. But in 1.4, when I run Solrindex I get this:
> > # bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
> > SolrIndexer: starting at 2011-12-07 17:09:58
> > org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/crawl_fetch
> > Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/crawl_parse
> > Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/parse_data
> > Input path does not exist: file:/C:/apache/apache-nutch-1.4/runtime/local/crawl/linkdb/parse_text
> >
> > Sure enough, those directories don't exist. But they didn't exist in 1.3
> > either. What am I missing?
> >
>
> The call signature for running the solrindex has changed. The linkdb is
> now optional, so you need to denote it with a "-linkdb" flag on the command
> line.
>
> bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
>
> Blessings,
> TwP
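
For anyone scripting this, the change Tim describes could be wrapped roughly as follows. This is only a sketch: the function name build_solrindex_cmd and its two arguments are made up for illustration and are not part of Nutch. It assumes the crawl directory layout from Chip's commands (crawldb, linkdb, segments) and only prints the command instead of running it. Without the flag, 1.4 treats crawl/linkdb as a segment, which would explain the missing crawl_fetch, crawl_parse, parse_data and parse_text paths in the error.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): prints a Nutch 1.4 solrindex
# command line, passing the linkdb with the -linkdb flag only when a
# linkdb directory is present under the crawl directory.
build_solrindex_cmd() {
    solr_url=$1     # e.g. http://127.0.0.1:8983/solr/
    crawl_dir=$2    # e.g. crawl

    cmd="bin/nutch solrindex $solr_url $crawl_dir/crawldb"

    # In 1.4 the linkdb is optional and must be marked with -linkdb.
    if [ -d "$crawl_dir/linkdb" ]; then
        cmd="$cmd -linkdb $crawl_dir/linkdb"
    fi

    # The glob is kept literal here; the shell expands it when the
    # printed command is actually run.
    echo "$cmd $crawl_dir/segments/*"
}
```

Calling build_solrindex_cmd http://127.0.0.1:8983/solr/ crawl from runtime/local prints the command to run. If crawl/linkdb exists, the output matches the corrected command Tim gives above.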




--
*Lewis*
