tomcat-users mailing list archives

From David Kerber
Subject Re: Code performance question #2
Date Mon, 07 Aug 2006 17:29:43 GMT
Peter Crowther wrote:

>> From: David Kerber
>> Is there a more efficient "split" method I could use?  Or am I
>> completely missing the point of what you are suggesting?
>
> I think you've slightly missed the point.  I assume you're calling your
> function 5 times, each with a different field name that you want out of
> it.  You're then invoking string handling functions on the entire
> decrypted string 5 times, each time going through the bytes to extract
> the piece you need.  In the process, you traverse bytes you don't need
> several times.  My suggestion is that you tokenise this *once*, and
> hence only pay the string-handling overhead once.  Then you get all the
> parameters out of the same tokenised version.

That is essentially my question:  how do I tokenize this more 
efficiently, without searching for the field names?  Do you think 
it would be more efficient to scan the string once and grab the field 
values as I get to each field marker?  I can do that, no problem.
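
To make the "scan once" idea concrete, here is a rough sketch of what I have in mind.  The record layout is an assumption for illustration only: I'm pretending the decrypted line looks like "name=value&name=value", and the class name SinglePassParser and the delimiters are made up, not the real format.  Each character is visited once, and every field lands in a map that can be queried afterwards without rescanning.

```java
import java.util.HashMap;
import java.util.Map;

public class SinglePassParser {

    // Assumed (hypothetical) layout: "name=value&name=value".
    // Walks the line once, recording each name/value pair as its
    // terminating '&' (or the end of the line) is reached.
    public static Map<String, String> parse(String line) {
        Map<String, String> fields = new HashMap<String, String>();
        int len = line.length();
        int keyStart = 0;   // index where the current field name begins
        int eq = -1;        // index of the first '=' in the current field

        for (int i = 0; i <= len; i++) {
            // Treat end-of-line as a final field terminator.
            char c = (i < len) ? line.charAt(i) : '&';
            if (c == '=' && eq < 0) {
                eq = i;
            } else if (c == '&') {
                if (eq >= 0) {
                    fields.put(line.substring(keyStart, eq),
                               line.substring(eq + 1, i));
                }
                keyStart = i + 1;
                eq = -1;
            }
        }
        return fields;
    }
}
```

The caller would then do one parse per incoming line and pull all 5 values out of the returned map, instead of running 5 separate searches over the whole string.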

> However, if the next thing you do is to write this to disk, I am even
> more convinced that you're optimising the wrong piece of code as the
> disk I/O is likely to take vastly more instructions than the string
> parse.
>
> These may be naïve questions, but I'll ask them anyway.  How have you
> identified these two pieces of code as the targets for optimisation?
> What profiler have you used, under what conditions?  What proportion of
> your overall CPU budget is taken by these two snippets of code?  Is the
> machine CPU-bound in the first place, or is the bottleneck elsewhere?
> If these are the worst culprits in your app, I'll be very surprised.

Those are good questions, but I've already considered them over the past 
few weeks as I've been working on this.  Yes, the machine is cpu-bound.  
My 768k data line will spike the cpu to 100% and hold it above 95% until 
the transmission queues on the other end of the WAN are caught up.  I've 
seen it take up to several hours depending on how long the comms were 
down.  The HD lights are busy, but this machine has a fast RAID system, 
and watching task manager tells me that the disk subsystem seems to be 
able to keep up. 

I haven't run a profiler on this code; I've tried, but getting the 
configuration figured out has stumped me every time.  I picked out these 
particular routines (and one other I haven't posted) because of the 
principle that 90% of the CPU time is taken by 10% of the code, and 
these routines are the only loops in the entire servlet (i.e. the only 
lines of code which are executed more than once per incoming data 
line).  The servlet itself is quite small, only 431 lines, including 
comments, declarations, initialization, and functional code.
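
Short of getting a profiler configured, one crude way to check whether those loops really dominate would be to wrap them in System.nanoTime() calls and accumulate the totals.  This is only a sketch: the SectionTimer class and its method names are hypothetical, and it measures wall-clock time rather than true CPU time, so the numbers would be a rough guide at best.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: accumulates elapsed time for one section of code.
// Atomic counters are used because servlet requests run on many threads.
public class SectionTimer {
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong calls = new AtomicLong();

    // Call just before the section being measured.
    public long start() {
        return System.nanoTime();
    }

    // Call just after the section, passing the value returned by start().
    public void stop(long startedAt) {
        totalNanos.addAndGet(System.nanoTime() - startedAt);
        calls.incrementAndGet();
    }

    public long calls() {
        return calls.get();
    }

    public long totalNanos() {
        return totalNanos.get();
    }

    // Average elapsed microseconds per call, or 0 if nothing was timed.
    public double averageMicros() {
        long n = calls.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1000.0 / n;
    }
}
```

With one timer around the parse routine and another around the whole request, comparing the two totals would show roughly what share of the time the parsing takes.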

Thanks for your comments...

> - Peter
