manifoldcf-user mailing list archives

From Erlend Garåsen <e.f.gara...@usit.uio.no>
Subject Crawling just one particular page from a host
Date Tue, 14 May 2013 11:45:41 GMT

I just noticed that even though "Include only hosts matching seeds?" 
is enabled, the web crawler still fetches everything from the host 
"www.ibsen.uio.no" when I place the following in the seed list:
http://www.ibsen.uio.no/forside.xhtml

I expected that only this page would be crawled, but that does not seem 
to be the case.
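
A minimal sketch of why a host-based restriction would behave this way, assuming the option compares only the host part of each discovered URL against the hosts of the seeds. This is an illustration, not ManifoldCF's actual implementation; the function names and the second and third candidate URLs are hypothetical.

```python
import re
from urllib.parse import urlparse

SEED = "http://www.ibsen.uio.no/forside.xhtml"


def host_matches_seed(url, seeds):
    """Host-only check: accepts any URL whose host equals a seed's host."""
    seed_hosts = {urlparse(seed).hostname for seed in seeds}
    return urlparse(url).hostname in seed_hosts


def matches_exact_page(url, seed):
    """Stricter check: accepts only the seed URL itself."""
    return re.fullmatch(re.escape(seed), url) is not None


# Hypothetical URLs a crawler might discover while following links.
candidates = [
    "http://www.ibsen.uio.no/forside.xhtml",
    "http://www.ibsen.uio.no/some/other/page.xhtml",
    "http://www.example.org/",
]

for url in candidates:
    print(url, host_matches_seed(url, [SEED]), matches_exact_page(url, SEED))
```

Under that assumption, the host-only check accepts every page on "www.ibsen.uio.no", while limiting the crawl to the single page would need a rule that matches the full URL, for example an anchored inclusion pattern.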

Erlend

-- 
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
