xerces-j-users mailing list archives

From ne...@ca.ibm.com
Subject Re: Schema key and unique contraints VERY slow
Date Fri, 15 Nov 2002 15:40:57 GMT
Hi Eric,

You'll definitely want to move to 2.2.1; a fair bit of work has been done
on improving performance (and conformance!) of identity constraint support
since 2.0.2.

That said, I can't say I'm shocked that you're not seeing good performance.
If you have N distinct keys in your document, Xerces will take O(N^2)
operations to prove to itself that they are distinct; I daresay something
could be done on this front.  But our goal is to optimize performance for
the case in which identity constraints aren't used, since they seem to be
relatively uncommon; so any change to make them perform better would have
to live within that goal.
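
To make the O(N^2) remark concrete, here is a rough sketch contrasting a
pairwise comparison of N key values with a hash-set check.  It is only an
illustration under assumptions: the class and method names are hypothetical,
key values are simplified to plain strings, and nothing here is taken from
the Xerces source.  For N = 10,000 the pairwise version makes up to
N(N-1)/2 = 49,995,000 comparisons, while the hashed version makes N
insertions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the Xerces implementation.
final class KeyUniqueness {

    // Quadratic: compare every value against every later value.
    static boolean allDistinctPairwise(List<String> values) {
        for (int i = 0; i < values.size(); i++) {
            for (int j = i + 1; j < values.size(); j++) {
                if (values.get(i).equals(values.get(j))) {
                    return false;
                }
            }
        }
        return true;
    }

    // Expected linear time: Set.add returns false when the value is already present.
    static boolean allDistinctHashed(List<String> values) {
        Set<String> seen = new HashSet<>();
        for (String value : values) {
            if (!seen.add(value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 10,000 distinct values, matching the document size reported in this thread.
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            ids.add("id" + i);
        }
        System.out.println(allDistinctPairwise(ids));
        System.out.println(allDistinctHashed(ids));
    }
}
```

A real validator would have to compare tuples of typed field values rather
than strings, so a hashed approach would need a hash that agrees with
schema value equality; that part is left out of the sketch.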

You're the first I know of to report this kind of problem.  So I guess you
could file this as a bug, but I can't promise how fast it'd be
investigated.  As I always say, patches welcome!  :)

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  neilg@ca.ibm.com




Eric_Schwarzenbach@Classwell.com
11/14/2002 09:04 PM
Please respond to xerces-j-user

To:       xerces-j-user@xml.apache.org
cc:
Subject:  Schema key and unique contraints VERY slow




Xerces2 (I'm using the freshly downloaded 2.0.2) seems to be hideously slow
validating xs:unique and xs:key constraints: painfully slow on a human
scale, not simply a processor scale. A large document of mine that
specifies a unique id attribute on each element of a certain kind (we
are talking 10,000 of these elements in a document of about 4 MB) goes
from taking a few seconds to validate without these constraints to several
minutes with them. This happens with both SAX and DOM parsing.

I expect there to be some processing cost to using this feature, but this is
fairly ridiculous. Doing similar checking in my own Java code, which
uses the parser, takes nowhere near as long (in fact, building indexes on
the entire document along with it takes nowhere near as long). Something
would seem to be seriously amiss, unless I'm out to lunch with regard to my
usage...
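
For reference, the kind of application-side check described above might
look roughly like the following SAX handler. This is a sketch under
assumptions, not the code actually used: the class name is hypothetical,
the element is matched by local name "elem" only (the namespace URI behind
the myns prefix is not given here), and elements without an id attribute
are skipped, as xs:unique would skip them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records id values that occur more than once on "elem" elements.
final class DuplicateIdHandler extends DefaultHandler {
    private final Set<String> seen = new HashSet<>();
    final List<String> duplicates = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!"elem".equals(localName)) {
            return; // only the element kind under the uniqueness rule
        }
        String id = atts.getValue("id"); // lookup by qualified name; null if absent
        if (id != null && !seen.add(id)) {
            duplicates.add(id); // add() returned false, so this id was seen before
        }
    }

    // Parses the given XML text and returns the repeated id values, in document order.
    static List<String> findDuplicates(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so localName is populated
            SAXParser parser = factory.newSAXParser();
            DuplicateIdHandler handler = new DuplicateIdHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.duplicates;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Each element costs one hash lookup, so the check stays roughly linear in
the number of elements no matter how deeply they are nested.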

An example of my usage is something like:

            <xs:unique name="makeElemUnique">
                  <xs:selector xpath=".//myns:elem"/>
                  <xs:field xpath="@id"/>
            </xs:unique>

This is defined within the scope of the root element which can (will)
contain many of these elems at many different levels.
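
A self-contained way to see the constraint fire is sketched below. It makes
several assumptions that are not part of the original report: the namespace
URI urn:example:myns, the root/elem content model, and the class name are
all made up for illustration, and it uses the javax.xml.validation API from
later JAXP releases rather than the feature-based setup of Xerces 2.0.2.
For brevity the elem elements sit directly under the root rather than at
several levels.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo; schema and namespace are illustrative assumptions.
final class UniqueConstraintDemo {
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + "    xmlns:myns='urn:example:myns'"
        + "    targetNamespace='urn:example:myns'"
        + "    elementFormDefault='qualified'>"
        + "  <xs:element name='root'>"
        + "    <xs:complexType>"
        + "      <xs:sequence>"
        + "        <xs:element name='elem' maxOccurs='unbounded'>"
        + "          <xs:complexType>"
        + "            <xs:attribute name='id' type='xs:string'/>"
        + "          </xs:complexType>"
        + "        </xs:element>"
        + "      </xs:sequence>"
        + "    </xs:complexType>"
        + "    <xs:unique name='makeElemUnique'>"
        + "      <xs:selector xpath='.//myns:elem'/>"
        + "      <xs:field xpath='@id'/>"
        + "    </xs:unique>"
        + "  </xs:element>"
        + "</xs:schema>";

    // Returns true if the document validates, false if validation reports an error.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                // With no ErrorHandler set, validation errors are thrown as SAXException.
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                return false;
            }
        } catch (SAXException | IOException e) {
            // Schema compilation or I/O failure, as opposed to an invalid document.
            throw new IllegalStateException(e);
        }
    }
}
```

A document whose elem children carry ids "a" and "b" should validate, while
one that repeats "a" should be rejected by the xs:unique rule.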

Should I file this as a bug?

Eric





---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org