tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Web spiders - disabling jsessionid
Date Fri, 01 Dec 2006 15:13:04 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Leon,

Leon Rosenberg wrote:
> you believe everything you've been told ?:-)

Well, I've been told by you, and I don't believe you. ;)

> google has 3 (at least 3 known) user agents: google, Mozilla with
> google-bot in the agent string (the one you sent) and another one,
> which is just Mozilla/5.0.
> 
> google uses this 3rd agent to check your site from another IP address,
> whether you do some ugly seo stuff, like cloaking etc.
> 
> If it detects that you deliver different content to his
> mozilla-disguised bot, your chances to be thrown out of the index are
> pretty high.

This sounds pretty plausible. Unfortunately, my empirical data suggests
otherwise. Allow me to post a portion of Webalizer's "top user agents"
list for a small site I maintain, covering the month of November:

# 	Hits 	% of hits 	User Agent
1 	26529 	48.64% 	Googlebot/2.1
2 	12077 	22.14% 	MSIE 6.0
3 	5285 	9.69% 	Yahoo! Slurp
4 	3353 	6.15% 	Mozilla/5.0

There are 11 more user agents which are all pretty much irrelevant. As
you can see, "googlebot" appears with a plurality of the hits (yeah,
it's not a really popular site). That's a /lot/ of hits compared to the
others. In fact, if you agree that "MSIE 6.0" is not google-in-disguise,
then all of the remaining user agents combined (about 29% of the hits)
do not come close to the Googlebot figure.

Webalizer can be configured to "collapse" different user agent strings
into one single user agent (say, anything containing MSIE into a single
MSIE in order to get an aggregate MSIE usage number). No such
aggregation is being used, here, so what you see is what you get.
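
To make the "collapsing" idea concrete, here is a rough Python sketch of
substring-based grouping. It is only an illustration of the concept, not
Webalizer's actual code or configuration; the function name and the
sample data are made up for this example.

```python
from collections import Counter


def collapse_agents(hits_by_agent, patterns):
    """Sum hits for every agent string that contains one of `patterns`.

    Hypothetical helper: agents matching no pattern keep their own name.
    """
    collapsed = Counter()
    for agent, hits in hits_by_agent.items():
        # The first matching pattern wins; otherwise use the full string.
        label = next((p for p in patterns if p in agent), agent)
        collapsed[label] += hits
    return dict(collapsed)


# Made-up sample data, NOT taken from the report quoted in this message.
sample = {
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)": 120,
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)": 30,
    "Googlebot/2.1 (+http://www.google.com/bot.html)": 200,
}

print(collapse_agents(sample, ["MSIE"]))
```

With grouping like this, both MSIE strings would be reported under one
"MSIE" entry, while the Googlebot string would stay as it is.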

If your assertion were correct, I would have expected to see a large
number of hits (perhaps 1/3 or 1/2 of the googlebot hits) coming from
"Mozilla" as google-incognito, but that's not the case.
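
As a quick sanity check on that comparison, here is the arithmetic as a
small Python sketch. The two hit counts are copied from the table above;
the 1/3 to 1/2 range is only my own guess at what "a large number" would
look like, not a measured figure, and the variable names are just for
illustration.

```python
# Hit counts copied from the "top user agents" table above.
googlebot_hits = 26529
mozilla_hits = 3353

# Share of Googlebot traffic that "Mozilla/5.0" would represent even if
# every single Mozilla/5.0 hit were Google in disguise (an upper bound).
ratio = mozilla_hits / googlebot_hits

# Assumed range for "a large number": 1/3 to 1/2 of the Googlebot hits.
expected_low = googlebot_hits / 3
expected_high = googlebot_hits / 2

print(f"Mozilla/5.0 relative to Googlebot: {ratio:.1%}")
print(f"Assumed expectation: {expected_low:.0f} to {expected_high:.0f} hits")
print(f"Observed Mozilla/5.0 hits: {mozilla_hits}")
```

Even counting every "Mozilla/5.0" hit as Google, the observed number
falls well short of the lower end of that assumed range.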

Can you give a reference to where you discovered this "fact"?

- -chris

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFcEZ/9CaO5/Lv0PARAuvqAJ41d+SbmskQIDH1xW5obI2f2xQWTwCfavcf
ed8ZaktgYzFpjfk2lli4vns=
=HZmV
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

