tomcat-users mailing list archives

From André Warnier
Subject Re: Tomcat threads, II
Date Wed, 12 Nov 2008 23:08:05 GMT
Chris, Chuck and others,

many thanks for taking the time to educate me (on both "Tomcat threads" 
threads).
I got lots of information and tips, which will be useful now or later. 
I'll now go sift through them again. At least now I have an idea where 
to start.

About the fact that my hardware sucks for Java: I know.
On the other hand, that machine is a good filter against programs' and 
programmers' hubris, particularly Java ones.  It's an "if it runs there, 
it will run anywhere" kind of thing, and I don't need 50 fake clients to 
stress it out (obviously).

On the other hand again, on the same machine I have a text search and 
retrieval application that can sift through a full-text index of 100,000 
documents (1 GB of text) and retrieve the ones I want in a couple of 
seconds. It has a 10 MB memory footprint.  That's why the 500 MB 
footprint of Tomcat (with the app) and the 5-minute delay in starting 
the app over 25 MB of XML struck me so.

I also have learned (separately, and confirmed here several times) that 
XML parsing is a hog, and not only in Java.  The DOM style of parsing 
in particular seems to scale much worse than linearly with document 
size.  Large text fields are absolute killers, and making them 
CDATA only partly alleviates that.
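
For comparison, here is a minimal sketch of the streaming alternative to 
DOM, using the StAX API (javax.xml.stream) that ships with Java 6 and 
later.  It walks the document event by event and never builds a tree in 
memory.  The class name, the method name and the sample XML are made up 
for illustration; they have nothing to do with the application discussed 
in this thread, and whether its parser can be swapped this way is an 
open question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of any application in this thread.
class StreamingCount {

    // Counts the elements called `name` by pulling parse events one at a
    // time, so memory use does not grow with the size of the document.
    static int countElements(String xml, String name) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input with one CDATA field and one plain text field.
        String xml = "<docs><doc><![CDATA[some text]]></doc><doc>more</doc></docs>";
        System.out.println(countElements(xml, "doc")); // prints 2
    }
}
```

For a real file one would pass a FileInputStream instead of a 
StringReader; the loop stays the same.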

One can always throw more hardware at things, and sometimes it may be 
cheaper than trying to over-optimise.  But some applications out there 
will kill any hardware.  It is sometimes surprisingly easy to gain a 
factor of 2 with little investment though, and if that means halving the 
number of servers and their attendant care and paraphernalia, it's still 
worth it for us.  Even when they are virtual.

Our main business is text-intensive document processing; that's why I 
am interested. Gaining 5 seconds in processing a document counts if 
you're processing thousands per day.  For a user with his finger on the 
mouse button too, there is a lot of difference between 1 and 3 seconds.
A note from one of you here regarding substrings in Java particularly 
caught my interest. And I'll go check whether the XML parser in that 
application could be replaced by a newer version.
One alternative to XML for feeding that application with data is CSV 
files (the text version of a spreadsheet).  I had discarded it until now 
as old-fashioned, "passé", limited etc. XML is so much more "in". But I 
am having second thoughts now, and I will give it a try.
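
As a rough idea of how little machinery CSV needs, here is a small 
sketch in plain Java that splits one CSV record into fields.  It handles 
quoted fields containing commas and doubled quotes, which matters for 
large text fields.  The class name and the sample record are invented 
for illustration, and it assumes a record never spans several lines, 
which real CSV exports do not always guarantee.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from any application in this thread.
class CsvLine {

    // Splits one record into fields.  Inside a quoted field a comma is
    // ordinary text and "" stands for a single literal quote.
    // Assumption: the record sits on a single line.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote: keep one
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);        // start the next field
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Made-up record: an id, a name with a comma, a text with quotes.
        System.out.println(parse("42,\"Smith, John\",\"a \"\"long\"\" text field\""));
    }
}
```

Reading a file is then a matter of looping over BufferedReader.readLine() 
and calling parse on each line, one record in memory at a time.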

When I started in this business, 64 KB was a nice quantity of memory to 
program in, and quite expensive too. I created and ran a payroll 
application for a 1,000-person company in there.  This Java app looks a 
lot cuter than the payroll did, but 500 megabytes of memory for one 
single Tomcat app, mmm.  Some reflexes remain for a lifetime.

