From: "Ivelin Ivanov"
To: "Rick Jelliffe", "Kohsuke KAWAGUCHI"
Subject: Re: Abstract Schemas APIs
Date: Tue, 2 Apr 2002 19:31:38 -0600

Rick,

I fully support extensions 1, 2, 3 & 6.

Actually Torsten is putting together a paper to summarize the extensions
which would be of interest to our group. We plan to send it to
representatives from XML Schema, Relax NG (James Clark), MSV & JARV
(Kohsuke) and Schematron (yourself) for review.

Regards,

Ivelin

----- Original Message -----
From: "Rick Jelliffe"
To: "Ivelin Ivanov"
Sent: Monday, April 01, 2002 8:03 PM
Subject: Abstract Schemas APIs

From: "Ivelin Ivanov"

> Have you been following the discussion with Kohsuke on a possible JARV
> integration?
> If you had a chance to see the JARV API, my source code and probably
> Torsten's API, maybe you can elaborate on a possible higher level validation
> API which will encompass multiple schemas.

I posted some thoughts to XML-DEV at
http://lists.xml.org/archives/xml-dev/200203/msg01179.html
which I will be sending to the DOM WG.

In general, Locators or SAXExceptions should be extended to

 1) carry paths as well as file/line/column messages
 2) carry HTML (or XML) text for richer messages in addition to
    plain text messages
 3) carry some kind of user-defined status constant apart from the
    basic Warning etc. types
 4) carry enough information to allow repair of the document
 5) carry enough information to interrogate the current parsing
    context(s) of the document at that failure point
 6) tell which parsing/validation system generated the failure

Some of these are pretty easy to do, but some, like 5), would probably
need a ground-up redesign, since I don't expect validators are designed
to allow snapshots of their states!

We have been using Xerces-J in our editor for external validation, and
the problems we have had with it are:

 1) Locator error messages for XML Schemas occur too far from the
    actual incident: for example, if a required element is missing,
    the locator seems to point at the end-tag of the parent.

 2) Xerces does not give the user enough control to turn different
    kinds of validation and WF checking on and off depending on the
    user's interest. When checking fragments, it is useless to get
    error messages relating to IDREF and keyref, for example.

 3) This even extends to WF checking. As a parser feature, it would be
    good to allow unrooted documents, or to allow truncated documents
    which miss out on some end-tags at the end of the document, or to
    try to match start- and end-tags in a case-insensitive way: this
    would allow much more flexible validation.

 4) The regular expression bug has a known fix, but this has never
    been incorporated AFAIK.
    I don't see how any XML Schemas datatypes can be reliable without
    it.

 5) When sending a non-WF document with multiple roots and the
    continue-after-error feature enabled, we get an out-of-memory
    exception, which is out of proportion to the problem that causes it.

In general, there is a design question of whether the technology should
impose a validation checklist on the user, where they have to attend to
earlier problems first, or whether the technology allows the user to
focus on particular regions of a document or areas of interest first:
for example, a user might want to get linking correct before the
metadata, but the DTD requires the metadata for validity. For
documents-in-progress, users should be allowed to work to their own
agenda and order as much as possible; this has been a long-running
problem with SGML and XML systems.

For contractual exchange of finished documents, the idea of "validity"
is useful. But for documents-in-progress, it can be counterproductive.
Instead, a more useful idea is "feasibility". For Xerces to be really
useful in document production, it will need more options or features
aimed at this kind of lesser validation.

I am presenting a paper on this at XML 2002 in Barcelona next month,
by the way, if anyone is interested: "When well-formedness is too
much and validity is not enough"

I hope this is of some use,

Cheers
Rick Jelliffe
www.topologi.com
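
To make the proposed Locator/SAXException extensions 1), 2), 3) and 6)
from earlier in this message more concrete, here is a minimal sketch of
how they might look as a subclass of the standard SAXParseException.
The class name RichValidationException, its fields, and the sample
values are all hypothetical illustrations; nothing like this is part of
SAX, Xerces or JARV. Extensions 4) and 5) are left out because, as noted
above, they would need deeper changes in the validators themselves.

```java
import org.xml.sax.Locator;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.LocatorImpl;

/** Hypothetical sketch only: not an existing SAX, Xerces or JARV class. */
class RichValidationException extends SAXParseException {
    private static final long serialVersionUID = 1L;

    private final String path;        // extension 1: a path to the failing node
    private final String richMessage; // extension 2: HTML (or XML) message text
    private final int status;         // extension 3: user-defined status constant
    private final String generator;   // extension 6: which validator reported it

    RichValidationException(String message, Locator locator, String path,
                            String richMessage, int status, String generator) {
        // The superclass copies file/line/column from the locator.
        super(message, locator);
        this.path = path;
        this.richMessage = richMessage;
        this.status = status;
        this.generator = generator;
    }

    String getPath() { return path; }
    String getRichMessage() { return richMessage; }
    int getStatus() { return status; }
    String getGenerator() { return generator; }

    public static void main(String[] args) {
        // Made-up position and values, for illustration only.
        LocatorImpl where = new LocatorImpl();
        where.setSystemId("chapter.xml");
        where.setLineNumber(12);
        where.setColumnNumber(7);

        RichValidationException e = new RichValidationException(
                "Required element 'title' is missing", where,
                "/book/chapter[2]",
                "<p>Required element <code>title</code> is missing.</p>",
                1001, "schematron");

        System.out.println(e.getGenerator() + ": " + e.getPath()
                + " (line " + e.getLineNumber() + ")");
    }
}
```

Because the sketch extends SAXParseException, an existing ErrorHandler
would still receive it unchanged, and only handlers that know about the
extra fields would need to downcast to read them.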