nutch-user mailing list archives

From remi tassing <tassingr...@gmail.com>
Subject Re: Returning web page abstract with Solr
Date Wed, 04 Apr 2012 07:33:53 GMT
Are you looking for result highlighting?
http://wiki.apache.org/solr/HighlightingParameters
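
As a rough sketch of what such a request could look like, the Python snippet below builds a Solr select URL with highlighting turned on for the 'content' field. The host, port and path (localhost:8983/solr/select), the helper name build_highlight_url and the fragment size are assumptions for illustration only; hl, hl.fl, hl.snippets and hl.fragsize are the parameters described on the wiki page above.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_highlight_url(term, field="content", fragsize=200, base_url=SOLR_SELECT_URL):
    """Build a Solr query URL that asks for one highlighted snippet per hit.

    build_highlight_url is a hypothetical helper, not part of Nutch or Solr.
    hl.fragsize is measured in characters, so 200 is only a guess at what
    yields roughly 20-30 words.
    """
    params = {
        "q": f"{field}:{term}",   # search only the page body field
        "fl": "url,title",        # stored fields to return with each hit
        "hl": "true",             # enable highlighting
        "hl.fl": field,           # field to generate snippets from
        "hl.snippets": 1,         # one snippet per document
        "hl.fragsize": fragsize,  # approximate snippet length in characters
        "wt": "json",             # response format
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_highlight_url("ocean"))
```

Note that Solr's highlighter builds snippets from the stored value of a field, so 'content' would need stored="true" in schema.xml (followed by a reindex) for this to return anything. As far as I know the schema.xml shipped with Nutch does not store 'content' by default, but check your own copy.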

Remi

On Wed, Apr 4, 2012 at 3:30 PM, smooth almonds
<sir.ramsel.james@gmail.com> wrote:

> I've crawled flickr.com with Nutch successfully and am trying to return a
> highlighted abstract using Solr as the indexer/searcher. So, if I query
> "ocean" then I want to return a 20-30 word abstract from just the text of
> the web page (not the title or url) containing that query term.
>
> I've copied the Nutch schema.xml as my Solr schema.xml.
>
>
> Is the 'content' field in the schema.xml the field that indexes/stores the
> body of a web page? Or is there another field?
>
> And how do I return this field? Do I have to turn storing on? Or is there
> another way I can have Solr retrieve the abstract from the web at search
> time so that I don't have to store all that data?
>
>
> I can't find anything regarding this on the web and it seems like it would
> be a pretty popular topic.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Returning-web-page-abstract-with-Solr-tp3883400p3883400.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
