nutch-user mailing list archives

From Edward Quick <edwardqu...@hotmail.com>
Subject relative urls
Date Wed, 10 Sep 2008 10:53:55 GMT

It looks to me as though Nutch doesn't handle relative links in pages. I have checked the FAQ
and set db.max.outlinks.per.page to -1, but that makes no difference in my case.

<property>
  <name>db.max.outlinks.per.page</name>
  <value>-1</value>
  <description>The maximum number of outlinks that we'll process for a page.
  If this value is nonnegative (>=0), at most db.max.outlinks.per.page outlinks
  will be processed for a page; otherwise, all outlinks will be processed.
  </description>
</property>


Here's an example of a relative URL on my intranet home page:
<a class=cbl1 href="/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date">View by date</a>
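
To make clear what I'd expect, here is a minimal Python sketch (not Nutch code) of standard relative-URL resolution for that link. The page address http://intranet.example.com/index.html and the function name resolve_link are placeholders I made up for illustration; only the href comes from my page.

```python
from urllib.parse import urljoin

# Hypothetical address of the intranet home page (assumption; the real host is not shown).
BASE_PAGE = "http://intranet.example.com/index.html"

# The href exactly as it appears in the anchor tag above.
RELATIVE_HREF = "/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date"


def resolve_link(base_page: str, href: str) -> str:
    """Resolve an href against the URL of the page that contains it."""
    # An href starting with "/" replaces the whole path of the base URL
    # but keeps its scheme and host.
    return urljoin(base_page, href)


if __name__ == "__main__":
    print(resolve_link(BASE_PAGE, RELATIVE_HREF))
```

With that placeholder base, the link should come out as http://intranet.example.com/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date, which is the sort of absolute URL I would expect to see among the outlinks.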

Is there something I should configure to handle these?

Thanks for any help.

Ed.



