httpd-dev mailing list archives

From Andrew Wilson <and...@tees.elsevier.co.uk>
Subject Re: awaiting that host survey....
Date Wed, 01 Nov 1995 12:16:39 GMT
> lowfat# telnet www.webstuff.apple.com 80 
> Trying 17.255.216.53...  
> Connected to webstuff.apple.com.  
> Escape character is '^]'.  
> HEAD / HTTP/1.0 
> 
> HTTP/1.0 200 OK 
> Date: Wed, 01 Nov 1995 07:03:47 GMT 
> Server: Apache/0.8.8 
> Content-type: text/html
> Content-length: 1891 
> Last-modified: Tue, 31 Oct 1995 20:42:01 GMT 
> 
> 	Brian


1)	http://www.netcraft.co.uk/Survey/Reports/951101/ALL/

	be aware of the double counting problem:

	http://foo.com/
	http://www.foo.com/

	being the same site (same IP), but by the simplest applied heuristic
	for uniqueness (cat hostnames | sort | uniq) they are 'different'.
	we're looking into ways to improve the uniqueness checking, but for
	the time being we just cope.  BE AWARE OF THIS.

	our justification is that we publish all the results and so other
	people are able to make better sense of them if they desire to.

2)	http://www.netcraft.co.uk/Survey/Reports/951102/ALL/

	only includes sites which were in previous surveys.  we lose the
	more up-to-date information pertaining to sites that were announced
	after we built the first dataset some 4 months ago.  the double-counting
	issue mentioned in (1) still pertains.  BE AWARE OF THIS.

	this URL (951102) is provided as a courtesy to interested parties but
	will more than likely be discontinued shortly. DO NOT MAKE REFERENCE
	TO IT.

3)	a much tighter heuristic for uniqueness will be applied in subsequent
	surveys.  we may also revisit old results and re-work them with the new
	heuristic in place.  *IF* we do this then the re-worked results will be
	displayed *separately*; we will not attempt to modify information which
	appears in already-published URLs.

4)	we are Borg, you will be assimilated.
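
A rough sketch of the double-counting problem from (1), for anyone who wants to re-work the published results themselves. It compares the plain sort | uniq count with a count grouped by IP address. The function names, the bar.org entry and the 192.0.2.x addresses are made up for illustration, and grouping by IP is only one possible tighter heuristic, not necessarily the one the survey will adopt.

```python
from collections import defaultdict


def count_naive(hostnames):
    """Count distinct hostname strings, like `cat hostnames | sort | uniq | wc -l`."""
    return len(set(hostnames))


def group_by_ip(host_to_ip):
    """Group hostnames by the IP address they resolve to.

    `host_to_ip` is a hypothetical, already-resolved mapping of
    hostname -> IP address; no DNS lookups are done here.
    """
    groups = defaultdict(list)
    for host, ip in host_to_ip.items():
        groups[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in groups.items()}


# Illustrative data only: the addresses are documentation-range placeholders.
resolved = {
    "foo.com": "192.0.2.1",
    "www.foo.com": "192.0.2.1",  # same machine as foo.com
    "bar.org": "192.0.2.2",
}

naive_count = count_naive(resolved)  # counts foo.com and www.foo.com separately
sites = group_by_ip(resolved)        # one entry per IP address

print(naive_count, "hostnames,", len(sites), "distinct addresses")
for ip, hosts in sorted(sites.items()):
    print(ip, " ".join(hosts))
```

Note that grouping by address errs the other way: distinct sites that share one IP would be merged into a single entry, so it is a different approximation rather than a fix.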

Cheers,
Ay.

