xerces-j-dev mailing list archives

From Neeraj Bajaj <Neeraj.Ba...@Sun.COM>
Subject Re: [Performance] Artifical EOFException in XMLEntityScanner.load()
Date Fri, 03 Sep 2004 06:14:31 GMT
Interestingly, we identified the same problem a few weeks back. I have a
fix on my local machine but haven't put it back because it's not that
clean.
Currently the fix makes some changes in DocumentScannerImpl.java and
XMLEntityScanner based on the depth of entities and on the current
entity being null.
I was looking for something cleaner that could tell the scanner about
the end of the document. I will look into this again and put back the
change in the next few days.
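
To illustrate what "telling the scanner about the end of the document" could look like, here is a minimal sketch in which the buffer refill reports exhaustion through its return value instead of throwing. The Buffer class and every method name below are hypothetical stand-ins for illustration only. They are not the Xerces classes and not the fix described above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class EndOfInputSketch {

    /** Hypothetical stand-in for a scanner buffer; NOT the Xerces XMLEntityScanner. */
    static final class Buffer {
        private final Reader in;
        private final char[] ch = new char[8];
        private int position = 0;
        private int count = 0;

        Buffer(Reader in) {
            this.in = in;
        }

        /** Refills the buffer. Returns false at end of input instead of throwing. */
        boolean load() throws IOException {
            int n = in.read(ch, 0, ch.length);
            position = 0;
            count = Math.max(n, 0);
            return n > 0;
        }

        /** Skips whitespace. Returns false once the input is exhausted. */
        boolean skipSpaces() throws IOException {
            while (true) {
                if (position == count && !load()) {
                    return false; // normal end of document, no exception needed
                }
                if (!Character.isWhitespace(ch[position])) {
                    return true; // found non-whitespace content
                }
                position++;
            }
        }
    }

    /** Returns true if any non-whitespace content remains in the given text. */
    static boolean hasContentAfterSpaces(String text) {
        try {
            return new Buffer(new StringReader(text)).skipSpaces();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // real I/O failures still surface
        }
    }

    public static void main(String[] args) {
        System.out.println(hasContentAfterSpaces("  \n "));
        System.out.println(hasContentAfterSpaces("  <!-- c -->"));
    }
}
```

With this shape, a caller that scans trailing whitespace can treat a false return as the normal end of the document, and exceptions stay reserved for genuine I/O errors.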

> I have a server app that parses millions of smallish documents.
>
> Performance has improved a lot by reusing XMLReaders. It's pretty
> good but could perhaps get better in light of the (perhaps dubious?)
> hints given by the profiler snippet below.
>
> Accordingly, the theory is that throwing an (artificial) EOFException
> in XMLEntityScanner.load() at the end of each document/entity
> consumes some 25% (JDK 1.5) and some 15% (JDK 1.4.2) of the total
> execution time, making it the single hottest spot in the program.
> This is probably due to the heavy nature of exceptions, in particular
> Throwable.fillInStackTrace(). If this can indeed be confirmed by
> others, would it be possible (and correct) to restructure the
> relevant Xerces internals to avoid raising artificial exceptions for
> what appears to be normal program control flow (the documents and
> streams are fine)?
>
> Configuration: Sun JDK 1.5 RC and Sun JDK 1.4.2, Xerces CVS head
> [never using the JDK-internal Xerces, which appears to be twice as
> slow in this case, for whatever reason]
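
For anyone who wants to check the fillInStackTrace() theory in isolation, here is a rough sketch that compares throwing a regular EOFException with throwing a subclass that skips stack capture. StacklessEOFException is a hypothetical class written for this illustration. It is not part of Xerces and is not the restructuring asked about above. Timings from a loop like this depend heavily on the JIT and on stack depth, so treat them as indicative only.

```java
import java.io.EOFException;

public class StacklessEofSketch {

    /** Hypothetical subclass that skips the stack walk done on construction. */
    static final class StacklessEOFException extends EOFException {
        private static final long serialVersionUID = 1L;

        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // capture no stack trace
        }
    }

    /** Throws and catches the chosen exception type; returns elapsed nanoseconds. */
    static long timeThrows(boolean stackless, int iterations) {
        long start = System.nanoTime();
        int caught = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                if (stackless) {
                    throw new StacklessEOFException();
                }
                throw new EOFException();
            } catch (EOFException e) {
                caught++;
            }
        }
        if (caught != iterations) {
            throw new IllegalStateException("unexpected catch count");
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Warm up both paths before measuring.
        timeThrows(false, n);
        timeThrows(true, n);
        System.out.println("regular   ns: " + timeThrows(false, n));
        System.out.println("stackless ns: " + timeThrows(true, n));
    }
}
```

A stackless exception only removes the stack-capture cost. It still uses an exception for ordinary control flow, so it is a way to measure the effect rather than a substitute for signalling end of document directly.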

JDK 1.5 RC contains almost the latest Xerces. Could you tell us what you
are doing so that we can identify the problem and fix it?


Thanks,
Neeraj

>
> Here is the JDK 1.5 profiler snippet (java -server -Xprof):
> -----------------------------------------------------------
>          Stub + native   Method
>  28.6%     0  +   487    java.lang.Throwable.fillInStackTrace
>  28.6%     0  +   487    Total stub
>
>   Thread-local ticks:
>   0.1%     1             Blocked (of total)
>   0.1%     2             Class loader
>   0.1%     2             Compilation
>   0.2%     3             Unknown: thread_state
>
> Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM
>
>   Thread-local ticks:
> 100.0%     1             Blocked (of total)
>
>
> Global summary of 35.44 seconds:
> 100.0%  1718             Received ticks
>   0.7%    12             Received GC ticks
>   9.7%   167             Compilation
>   0.1%     2             Class loader
>   0.2%     3             Unknown code
>
> real    0m35.715s
> user    0m34.170s
> sys     0m0.190s
>
>
> Here is the JDK 1.4 profiler snippet (java -server -Xprof):
> -----------------------------------------------------------
>         Stub + native   Method
>  12.7%     4  +   239    java.lang.Throwable.fillInStackTrace
>  12.7%     4  +   239    Total stub
>
>   Runtime stub + native  Method
>   0.2%     3  +     0    _rethrow_Java
>   0.2%     3  +     0    Total runtime stubs
>
>   Thread-local ticks:
>   3.1%    61             Blocked (of total)
>   0.4%     7             Interpreter
>   0.1%     2             Compilation
>   4.9%    93             Unknown: running frame
>
>
> Flat profile of 0.00 secs (1 total ticks): DestroyJavaVM
>
>   Thread-local ticks:
> 100.0%     1             Blocked (of total)
>
>
> Global summary of 43.25 seconds:
> 100.0%  2071             Received ticks
>   3.8%    79             Received GC ticks
>   6.2%   128             Compilation
>   0.5%    10             Other VM operations
>   0.3%     7             Interpreter
>   4.5%    93             Unknown code
>
> real    0m43.517s
> user    0m42.100s
> sys     0m0.530s
>
>
>
> Trace via java -server -agentlib:hprof=cpu=samples,depth=30:
> -----------------------------------------------------------
> TRACE 300347:
>         java.lang.Throwable.fillInStackTrace(Throwable.java:Unknown line)
>         java.lang.Throwable.<init>(Throwable.java:181)
>         java.lang.Exception.<init>(Exception.java:29)
>         java.io.IOException.<init>(IOException.java:28)
>         java.io.EOFException.<init>(EOFException.java:32)
>         org.apache.xerces.impl.XMLEntityScanner.load(<Unknown Source>:Unknown line)
>         org.apache.xerces.impl.XMLEntityScanner.skipSpaces(<Unknown Source>:Unknown line)
>         org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(<Unknown Source>:Unknown line)
>         org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(<Unknown Source>:Unknown line)
>         org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>         org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>         org.apache.xerces.parsers.XMLParser.parse(<Unknown Source>:Unknown line)
>         org.apache.xerces.parsers.AbstractSAXParser.parse(<Unknown Source>:Unknown line)
>         nu.xom.Builder.build(Builder.java:786)
>         nu.xom.Builder.build(Builder.java:569)
>         gov.lbl.dsd.firefish.trash.XMLXomBench.main(XMLXomBench.java:62)
>
>
> I guess the relevant block is:
> -----------------------------------------------------------
>
> XMLEntityScanner.load(...):
>             ...
>             if (changeEntity) {
>                 fEntityManager.endEntity();
>                 if (fCurrentEntity == null) {
>                     throw new EOFException();
>                 }
>                 // handle the trailing edges
>                 if (fCurrentEntity.position == fCurrentEntity.count) {
>                     load(0, true);
>                 }
>             }
>
>
> Comments?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-j-dev-help@xml.apache.org
>




