hc-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 36932] - httpclient not able to download certain urls
Date Wed, 05 Oct 2005 17:52:14 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36932>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36932





------- Additional Comments From ajd_765@yahoo.com  2005-10-05 19:52 -------

Given that there's a ton of crap HTML out there in the wild,
what is the recommended way of handling URLs that don't conform,
yet could be meaningfully used anyway?

From the crawler perspective, we want to "eat crap and poo gold"
(with apologies to Postel).  Any recommendations would be much appreciated.

Thanks for the hard work and great library!

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org

