nutch-user mailing list archives

From Edward Quick <>
Subject relative urls
Date Wed, 10 Sep 2008 10:53:55 GMT

It looks to me like Nutch doesn't handle pages with relative links. I have checked the FAQ
and set db.max.outlinks.per.page to -1, but that makes no difference in my case.

  <description>The maximum number of outlinks that we'll process for a page.
  If this value is nonnegative (>=0), at most db.max.outlinks.per.page outlinks
  will be processed for a page; otherwise, all outlinks will be processed.

Here's an example of a relative url on my intranet home page:
<a class=cbl1 href="/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date">View by date</a>
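
For reference, this is roughly what I would expect to happen to a link like that: the href gets resolved against the URL of the page it was found on. The snippet below is only a sketch using the standard java.net.URI class, not Nutch's own parsing code, and the host name intranet.example.com and the class name RelativeUrlDemo are made up for illustration.

```java
import java.net.URI;

public class RelativeUrlDemo {
    // Resolve an href against the URL of the page it appeared on,
    // following the standard relative-reference rules in java.net.URI.
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical page URL; the real intranet host is not given above.
        String pageUrl = "http://intranet.example.com/home.html";
        String href = "/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date";
        System.out.println(resolve(pageUrl, href));
    }
}
```

Because the href starts with a slash, only the scheme and host are taken from the page URL, and the path comes entirely from the href.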

Is there something I should configure to handle these?

Thanks for any help.

