lucene-solr-user mailing list archives

From "Jack Krupansky" <>
Subject Re: Pushing content to Solr from Nutch
Date Fri, 11 Apr 2014 04:52:26 GMT
Does your Solr schema match the data output by Nutch? It’s up to you to create a Solr schema
that matches Nutch’s output; read the Nutch documentation for that information. Solr doesn’t
define those fields; Nutch does.
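
One rough way to check for a mismatch is to compare the field names declared in the Solr schema against the fields Nutch is expected to send. The sketch below is only an illustration: the field list and the trimmed schema are hypothetical samples, not the actual Nutch 1.8 output or the schema file shipped with Nutch, and the helper name `missing_fields` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of field names a Nutch index job might send;
# the real list depends on the Nutch version and the enabled plugins.
NUTCH_FIELDS = {"id", "url", "content", "title", "tstamp"}

# Hypothetical, trimmed-down Solr schema.xml used only for illustration.
SCHEMA_XML = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="url" type="string" indexed="true" stored="true"/>
    <field name="title" type="text_general" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def missing_fields(schema_xml, expected):
    """Return the expected field names that the schema does not declare."""
    root = ET.fromstring(schema_xml)
    # Collect the name attribute of every <field> element in the schema.
    declared = {field.get("name") for field in root.iter("field")}
    return sorted(set(expected) - declared)


MISSING = missing_fields(SCHEMA_XML, NUTCH_FIELDS)
print("Fields missing from the Solr schema:", MISSING)
```

This only looks at explicit `<field>` declarations; a schema that relies on `<dynamicField>` patterns would need an additional check.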

-- Jack Krupansky

From: Xavier Morera 
Sent: Thursday, April 10, 2014 12:58 PM
Subject: Pushing content to Solr from Nutch


I have followed several Nutch tutorials - including the main one
- to crawl sites (which works, I can see in the console as the pages get crawled and the directories
built with the data) but for the life of me I can't get anything posted to Solr. The Solr
console doesn't even squint, therefore Nutch is not sending anything.

This is the command I run; it crawls and, in theory, should also post to Solr:
bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr 2

I also found that I can use this one once the content has already been crawled:
bin/nutch solrindex http://localhost:8983/solr crawl/crawldb crawl/linkdb crawl/segments/*

But no luck.

The only thing that caught my attention was the following message. I read that adding the
property below would fix it, but it doesn't.
No IndexWriters activated - check your configuration

This is the property

Any idea? Apache Nutch 1.8 running Java 1.6 via Cygwin on Windows.


Xavier Morera

CR: +(506) 8849 8866
US: +1 (305) 600 4919 
skype: xmorera