hc-dev mailing list archives

From Henri Yandell <flame...@gmail.com>
Subject robots.txt parser
Date Mon, 01 Nov 2004 19:37:39 GMT
Does HttpClient have anything to parse a robots.txt file?

If not, would anyone be interested in http://www.osjava.org/norbert/ ?

I'd like to put it in the sandbox, and I think it would be of real
interest to the HttpClient project and its users.

It would need adjusting to sit on top of HttpClient, since it currently
uses the JDK to download the robots.txt file itself, but that
shouldn't be very hard. Equally, HttpClient might want to refuse, by
default, to download anything that the robots.txt rules disallow, and
make people explicitly configure HttpClient to ignore robots.txt to
get around that.
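
To make the idea concrete, here is a minimal sketch of the kind of check
being discussed. It is not Norbert's API and not anything in HttpClient:
the class name RobotsRules and its methods are hypothetical, and it only
handles the basic User-agent / Disallow records with prefix matching. It
takes the robots.txt text as a string, so the download could be done by
the JDK or by HttpClient.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of HttpClient or Norbert.
final class RobotsRules {
    private final List<String> disallowed;

    private RobotsRules(List<String> disallowed) {
        this.disallowed = disallowed;
    }

    // Parses robots.txt text and keeps the Disallow prefixes that apply to
    // userAgent. Records naming the agent win over the "*" record.
    static RobotsRules parse(String robotsTxt, String userAgent) {
        String agent = userAgent.toLowerCase(Locale.ROOT);
        List<String> specific = new ArrayList<>();
        List<String> wildcard = new ArrayList<>();
        boolean foundSpecific = false;
        boolean matchesSpecific = false;
        boolean matchesWildcard = false;
        boolean inAgentLines = false; // true while reading consecutive User-agent lines

        for (String raw : robotsTxt.split("\\r?\\n")) {
            // Strip comments, then skip anything that is not "field: value".
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    // First User-agent line of a new record: reset the match state.
                    matchesSpecific = false;
                    matchesWildcard = false;
                    inAgentLines = true;
                }
                if (value.equals("*")) {
                    matchesWildcard = true;
                } else if (!value.isEmpty()
                        && agent.contains(value.toLowerCase(Locale.ROOT))) {
                    matchesSpecific = true;
                    foundSpecific = true;
                }
            } else if (field.equals("disallow")) {
                inAgentLines = false;
                if (value.isEmpty()) {
                    continue; // an empty Disallow blocks nothing
                }
                if (matchesSpecific) {
                    specific.add(value);
                }
                if (matchesWildcard) {
                    wildcard.add(value);
                }
            } else {
                inAgentLines = false; // unknown fields are ignored
            }
        }
        return new RobotsRules(foundSpecific ? specific : wildcard);
    }

    // A path is allowed unless it starts with one of the Disallow prefixes.
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

A client that wanted the "refuse by default" behaviour could call
RobotsRules.parse(text, agentName) once per host and check
isAllowed(path) before each request, with a configuration switch to skip
the check. Where that hook would live in HttpClient is left open here.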

