nutch-user mailing list archives

From remi tassing
Subject Re: Returning web page abstract with Solr
Date Wed, 04 Apr 2012 07:33:53 GMT
Are you looking for result highlighting?
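
If so, a rough sketch of a highlighting request might look like the one below. It only builds the query URL and does not contact a server. The Solr address, the `build_highlight_url` helper, and the `url,title` field list are assumptions for illustration; `hl`, `hl.fl`, `hl.snippets` and `hl.fragsize` are standard Solr highlighting parameters. Note that `hl.fragsize` is measured in characters, so 200 is only a guess at what 20-30 words comes to. As far as I know, Solr's standard highlighter can only build snippets from a stored field, so `content` would probably need `stored="true"` in schema.xml followed by a reindex.

```python
from urllib.parse import urlencode


def build_highlight_url(select_url, term, field="content", fragsize=200):
    """Build a Solr select URL that asks for one highlighted snippet.

    Hypothetical helper: select_url and the field names are assumptions
    taken from the question, not from a verified Nutch/Solr setup.
    """
    params = {
        "q": f"{field}:{term}",   # search the page body only
        "wt": "json",             # response format
        "fl": "url,title",        # fields returned for each document
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field to build snippets from
        "hl.snippets": 1,         # one snippet per document
        "hl.fragsize": fragsize,  # snippet length in characters
    }
    return f"{select_url}?{urlencode(params)}"


# Assumed local Solr instance; adjust host, port and path to your install.
url = build_highlight_url("http://localhost:8983/solr/select", "ocean")
print(url)
```

The snippets come back in a separate `highlighting` section of the response, keyed by each document's unique key, rather than inside the documents themselves.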


On Wed, Apr 4, 2012 at 3:30 PM, smooth almonds wrote:

> I've crawled with Nutch successfully and am trying to return a
> highlighted abstract using Solr as the indexer/searcher. So, if I query
> "ocean" then I want to return a 20-30 word abstract from just the text of
> the web page (not the title or url) containing that query term.
> I've copied the Nutch schema.xml as my Solr schema.xml.
> Is the 'content' field in the schema.xml the field that indexes/stores the
> body of a web page? Or is there another field?
> And how do I return this field? Do I have to turn storing on? Or is there
> another way I can have Solr retrieve the abstract from the web at search
> time so that I don't have to store all that data?
> I can't find anything regarding this on the web and it seems like it would
> be a pretty popular topic.