tomcat-users mailing list archives

From Bryce Nesbitt <>
Subject Re: Web spiders - disabling jsessionid
Date Sun, 03 Dec 2006 07:45:58 GMT
>As you may know url rewriting feature is not a nice thing when spiders
>come to index your site -

I'm having the same kind of trouble with JSESSIONID and the search engines
Google, Accoona, Alexa and Exalead.

My approach was to contact each firm and ask them to fix their bugs.  Each
crawls each page of my site 5-50 times a day, presumably because they don't
understand the ";" in the URL.  For example:

- - [28/Nov/2006:16:05:36 -0800] "GET /;jsessionid=68B86DFF8E4A8597B210531C3431965D HTTP/1.1" 200 17195 "-" "Exabot/3.0"
- - [28/Nov/2006:16:17:30 -0800] "GET /;jsessionid=0621414681C92E1A00A9428A7800AC30 HTTP/1.1" 200 17195 "-" "Exabot/3.0"
- - [28/Nov/2006:17:00:36 -0800] "GET /;jsessionid=0079FCD91ED8E5B86902228D285CCEEF HTTP/1.1" 200 17195 "-" "Exabot/3.0"
- - [28/Nov/2006:20:41:50 -0800] "GET /;jsessionid=DE9B61384D3D75DE9EB38A21F066E433 HTTP/1.1" 200 17195 "-" "Exabot/3.0"
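
To make the duplication concrete, here is a rough sketch (not something
running on the site) that strips the ";jsessionid=..." path parameter from
each logged request and counts hits per resulting page.  The class and
method names are made up for illustration, and the regular expressions
assume the access-log layout shown in the entries above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: groups access-log hits by URL with the session ID removed.
final class JsessionidLogCheck {
    // Captures the request path from the quoted request line, e.g. "GET /x HTTP/1.1".
    private static final Pattern REQUEST =
            Pattern.compile("\"GET (\\S+) HTTP/[0-9.]+\"");
    // A ";jsessionid=<id>" path parameter, ending at the next '?', '#', ';' or '/'.
    private static final Pattern SESSION_PARAM =
            Pattern.compile("(?i);jsessionid=[^?#;/]*");

    // Returns the URL without its jsessionid path parameter.
    static String stripSessionId(String url) {
        return SESSION_PARAM.matcher(url).replaceAll("");
    }

    // Counts requests per page once the session ID is ignored.
    static Map<String, Integer> countByPage(String[] logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = REQUEST.matcher(line);
            if (m.find()) {
                counts.merge(stripSessionId(m.group(1)), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four Exabot entries quoted above.
        String[] lines = {
            "- - [28/Nov/2006:16:05:36 -0800] \"GET /;jsessionid=68B86DFF8E4A8597B210531C3431965D HTTP/1.1\" 200 17195 \"-\" \"Exabot/3.0\"",
            "- - [28/Nov/2006:16:17:30 -0800] \"GET /;jsessionid=0621414681C92E1A00A9428A7800AC30 HTTP/1.1\" 200 17195 \"-\" \"Exabot/3.0\"",
            "- - [28/Nov/2006:17:00:36 -0800] \"GET /;jsessionid=0079FCD91ED8E5B86902228D285CCEEF HTTP/1.1\" 200 17195 \"-\" \"Exabot/3.0\"",
            "- - [28/Nov/2006:20:41:50 -0800] \"GET /;jsessionid=DE9B61384D3D75DE9EB38A21F066E433 HTTP/1.1\" 200 17195 \"-\" \"Exabot/3.0\""
        };
        System.out.println(countByPage(lines)); // prints {/=4}
    }
}
```

All four requests collapse to the same page, "/", which is consistent with
the crawler treating each session ID as a distinct URL.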

While I have received responses from all, and three have promised action,
none has fixed the issue to date.  So I tried to fix it myself.

This was easy under Resin.  But under Tomcat I have had no luck.

I added <%@ page session="false" %> to each of my Struts Tiles
templates, but I still see some session IDs in the logs.  Now I'm
confused.  My application does not care about sessions.  I can't force a
";" based jsessionid to show in Firefox.  Yet somehow the darn robots
(and some other browsers) are seeing them.  What's up?

- - [02/Dec/2006:23:29:20 -0800] "GET ng;jsessionid=D5912D8983A86FF2BCF3381DB454D54A HTTP/1.1" 200 4960 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/418.9 (KHTML, like Gecko) Safari/419.3"

The website in question is

