nutch-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: RSS-fecter and index individul-how can i realize this function
Date Fri, 02 Feb 2007 18:19:29 GMT
Gal Nitzan wrote:
> IMHO the data that is needed i.e. the data that will be fetched in the next fetch process
> is already available in the <item> element. Each <item> element represents one
> web resource. And there is no reason to go to the server and re-fetch that resource.

Perhaps ProtocolOutput should change.  The method:

   Content getContent();

could be deprecated and replaced with:

   Content[] getContents();

This would require changes to the indexing pipeline.  I can't think of 
any severe complications, but I haven't looked closely.
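
As a rough illustration only, here is a minimal sketch of the shape of that change. The Content and ProtocolOutput classes below are hypothetical, simplified stand-ins rather than the real Nutch classes, and the urlsToIndex helper is invented for the example. It assumes the old single-content accessor is kept for existing callers and simply returns the first entry.

```java
import java.util.ArrayList;
import java.util.List;

public class ProtocolOutputSketch {

    // Hypothetical, simplified stand-in for Nutch's Content class.
    static final class Content {
        final String url;
        final String body;

        Content(String url, String body) {
            this.url = url;
            this.body = body;
        }
    }

    // Hypothetical, simplified stand-in for ProtocolOutput that can carry
    // several Content objects, e.g. one per RSS <item>.
    static final class ProtocolOutput {
        private final Content[] contents;

        ProtocolOutput(Content... contents) {
            this.contents = contents.clone();
        }

        // Old accessor, kept for existing callers (an assumption of this sketch):
        // returns the first content, or null if there is none.
        @Deprecated
        Content getContent() {
            return contents.length > 0 ? contents[0] : null;
        }

        // Proposed replacement: every content produced by one fetch.
        Content[] getContents() {
            return contents.clone();
        }
    }

    // Invented helper: downstream code loops over all contents
    // instead of assuming exactly one per fetch.
    static List<String> urlsToIndex(ProtocolOutput output) {
        List<String> urls = new ArrayList<String>();
        for (Content c : output.getContents()) {
            urls.add(c.url);
        }
        return urls;
    }

    public static void main(String[] args) {
        ProtocolOutput output = new ProtocolOutput(
            new Content("http://example.com/item1", "first item"),
            new Content("http://example.com/item2", "second item"));
        System.out.println(urlsToIndex(output));
    }
}
```

The point of the sketch is that callers in the indexing pipeline would move from a single getContent() call to a loop over getContents(), which is where the real changes would land.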

Could something like that work?

Doug
