hc-dev mailing list archives

From Adrian Sutton <adr...@intencha.com>
Subject Re: Cannot use HttpClient to search google
Date Sun, 11 Jul 2004 07:25:06 GMT
> <html><head><title>403 Forbidden</title>....
> <blockquote><H1>Forbidden</H1>Your client does not have permission to
> get URL <code>/search?hl=en&amp;ie=UTF-8&amp;q=sql+server+trace</code>
> from this server.  (Client IP address: xx.xx.xx.xx)<br><br>Please see 
> Google's Terms of Service posted at 
> http://www.google.com/terms_of_service.html
> ....
>
> I guess the main reason is google uses akamai's network to distribute 
> loads.

No, it means you should read the terms of service, specifically the part 
about not using "screen-scraping" techniques to programmatically perform 
searches (which is what you're trying to do).  You should use the 
Google SOAP search service instead, as it will make your life a lot 
easier.

It would not be appropriate to discuss ways around the technical 
limitations Google uses to enforce their terms of service on an Apache 
Software Foundation mailing list.

The particular section of the Google ToS that I believe applies here is 
listed under the "Personal Use Only" and "No Automated Querying" 
headings.

Information on Google's SOAP APIs is available at 
http://www.google.com.au/apis/ (note they also have terms of service)

Finally, sorry if this seems abrupt, but it is important for the ASF to 
clearly not support use of their products in any way that may cause 
legal trouble.  If you feel you are following the terms of service for 
Google and I've missed something, then my apologies.

Regards,

Adrian Sutton.

----------------------------------------------
Intencha "tomorrow's technology today"
Ph: 38478913 0422236329
Suite 8/29 Oatland Crescent
Holland Park West 4121
Australia QLD
www.intencha.com