nutch-user mailing list archives

From SebaZ <sebastian.zaborow...@gmail.com>
Subject HTTP REFERER is missing
Date Wed, 06 Jun 2012 11:36:24 GMT
I have successfully set up Nutch as the crawler for a Solr index on the
http://szukaj.ug.edu.pl site, but there is a problem with the HTTP Referer
header: Nutch does not send it when crawling sites.

Is it possible to configure Nutch to send the Referer header with each request?

Scenario:
1. Nutch opens www.domain.pl.
2. Nutch finds a link to www.domain.pl/abcd.pdf.
3. Nutch requests www.domain.pl/abcd.pdf, but without
HTTP_REFERER=www.domain.pl
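
To make the expected behaviour concrete, here is a minimal Python sketch of the scenario above. It is not Nutch code and does not use any Nutch API; the helper build_request and its found_on parameter are hypothetical names chosen for illustration. It only builds the requests (nothing is sent) and shows that the request for the PDF would carry the page on which the link was found as its Referer.

```python
from urllib.request import Request


def build_request(url, found_on=None):
    """Build (but do not send) a GET request for url.

    found_on is a hypothetical parameter: the URL of the page on which
    the link to url was discovered. When it is known, it is sent as the
    HTTP Referer header; seed URLs have no referring page.
    """
    headers = {}
    if found_on is not None:
        headers["Referer"] = found_on
    return Request(url, headers=headers)


# Step 1: the seed page is fetched without a Referer.
seed = build_request("http://www.domain.pl/")

# Steps 2-3: the PDF link was found on the seed page, so that page
# becomes the Referer of the follow-up request.
pdf = build_request("http://www.domain.pl/abcd.pdf",
                    found_on="http://www.domain.pl/")

print(seed.has_header("Referer"))
print(pdf.get_header("Referer"))
```

The point of the sketch is only that the crawler would need to remember where each link was found in order to fill in the header on the later request.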

--
View this message in context: http://lucene.472066.n3.nabble.com/HTTP-REFERER-is-missing-tp3987967.html
Sent from the Nutch - User mailing list archive at Nabble.com.
