ws-dev mailing list archives

From Benson Margulies <bimargul...@gmail.com>
Subject Re: How many XML Schema libraries at ASF is too many XML Schema Libraries?
Date Mon, 06 Apr 2009 17:31:39 GMT
Hypothetically, one could build at the level that Xerces works, and then
'serialize' as XSD, or RelaxNG (shades of MSV) or whatever. Practically, I
have my doubts. Now, the Eclipse question is worth some pursuit. Mayhap
whatever data model they are using has some advantages. Though Eclipse has
shown me a propensity to trash XSD documents when edited with their GUI.

In a sense, there are three levels:

Bottom: Abstract data model of XML schema (small letters) for validation.
It's a model of constraints on XML, and could hypothetically serve multiple
languages. There's one in Xerces, and there's one in MSV. Maybe also in
xsom.

Middle: W3C XML Schema in particular. It has a set of objects.

Top: DOM as a representation of W3C XML Schema.

Right now, we're communicating between various parties in DOM, and that
might be the best we can or want to do.
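
As a rough illustration of what "communicating in DOM" looks like using only
the JDK's JAXP APIs, the sketch below parses a schema document into DOM and
hands that DOM to SchemaFactory for validation. The class and method names
(DomSchemaHandoff, parse, compile, isValid) and the tiny inline schema are
made up for illustration; neither XmlSchema nor the Xerces XS API is used
here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class: DOM is the only thing passed between steps.
public class DomSchemaHandoff {

    // Parse an XML string into a namespace-aware DOM document.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for schema processing
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Hand the schema's DOM to the JAXP SchemaFactory to get a compiled Schema.
    public static Schema compile(Document schemaDom) throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return sf.newSchema(new DOMSource(schemaDom));
    }

    // Validate an instance DOM; the default error handler throws on errors.
    public static boolean isValid(Schema schema, Document instance) throws Exception {
        try {
            schema.newValidator().validate(new DOMSource(instance));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative schema only: a single string-typed element.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                   + "<xs:element name='greeting' type='xs:string'/>"
                   + "</xs:schema>";
        Schema schema = compile(parse(xsd));
        System.out.println(isValid(schema, parse("<greeting>hi</greeting>")));
        System.out.println(isValid(schema, parse("<farewell>bye</farewell>")));
    }
}
```

Whatever object model sits above the DOM, it has to be turned back into a
Document (or another Source) before a call like compile() can accept it,
which is the conversion cost discussed further down the thread.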


On Mon, Apr 6, 2009 at 11:19 AM, Daniel Kulp <dkulp@apache.org> wrote:

>
> Yeah, I think the data model thing is definitely the biggest issue. Part of
> the goal of XmlSchema was to have a representation that could be used by a
> "GUI" or something to help create/edit schemas. Also, the obvious "java
> first" cases where we need to convert Java objects to schema require
> building up a Schema object model that can then write out the schema. It's
> definitely a different use case than the Xerces model. Xerces is targeted at
> consuming schema (as is xsom, I think) whereas XmlSchema has more
> capabilities for producing schema.
>
> Dan
>
>
> On Mon April 6 2009 10:30:46 am Benson Margulies wrote:
> > The data model issue, I think, is the crux here. My belief is that
> > XmlSchema users need to ingest, modify, and then emit XML documents
> > containing schema, and I think these documents need to be recognizable to
> > their original authors when written back out. For example, if a user's
> > schema uses xs:include or xs:import to pull in a schema from someplace
> > else, ending up with a schema that just merges the content inline, with no
> > reference to the original, could be a problem. Things like attribute groups
> > have a role to play in creating schemas that are readable to humans.
> > <xs:annotation><xs:documentation>blah</xs:documentation></xs:annotation> is
> > another thing that would be pretty hard to preserve.
> >
> > A programmer working, say, with the CXF Aegis binding, can open a book on
> > XML Schema, and find an API that corresponds to the constructs he or she
> > sees there. In the model you are describing, that person would need to
> > become familiar with the underlying model. I'm not by any means describing
> > this as a fatal flaw, just a consideration.
> >
> > My decision to allow XmlSchema to depend on Java 1.5 for my own
> > convenience is not a big deal. If we were to come up with a plan for a
> > unified approach, I don't see any problem with retreating in version space.
> >
> > I don't, then, see using the Xerces rep as the one-and-only rep as viable,
> > due to the data model issues. However, it might be that some sharing with
> > it could make various things more efficient and slimmer. Could one imagine
> > mapping more directly from the XmlSchema 'surface' objects to a Xerces
> > model as a more efficient way to obtain a Schema object?
> >
> > I probably need to do more reading before commenting further.
> >
> >
> >
> > On Mon, Apr 6, 2009 at 10:17 AM, Michael Glavassevich
> > <mrglavas@ca.ibm.com> wrote:
> > > Benson Margulies <bimargulies@gmail.com> wrote on 04/06/2009 07:53:53 AM:
> > > > Folks,
> > > >
> > > > I do not know of any way to communicate with the originators of
> > > > Apache XmlSchema, so I don't think it's productive to explore their
> > > > motives in retrospect. I'm more interested in looking forward.
> > >
> > > I think it is important to understand what the API is trying to achieve
> > > and what capability its audience expects to get from it.
> > >
> > > Looking at the WS-Commons API I see at least a few things which don't
> > > naturally fit into Xerces like classes for xs:imports, xs:includes,
> > > attribute group references, elementFormDefault, etc... These constructs
> > > aren't part of the data model defined by the schema spec yet I assume
> > > there was some need to include them in the API and that there are
> > > possibly users which now depend on them being there. Xerces doesn't hold
> > > on to any of those things and it would be quite difficult to add them in
> > > now, both in the implementation and Xerces' API which is stable and needs
> > > to remain compatible even when we evolve it. Also, I see there's been
> > > some effort to move towards using Java 5 language features. We only
> > > recently voted to move to a minimum of JDK 1.3 and given feedback from
> > > our user community it's still going to be awhile before we would move
> > > higher than that.
> > >
> > > > Apache XmlSchema is read-write. One can create an entire schema,
> > > > from scratch, operating against its abstract data model. If the
> > > > Xerces model is adaptable in that direction, and the Xerces
> > > > community is amenable, then it's worth looking into extending the
> > > > Xerces implementation in this direction. If not, we've discovered,
> > > > perhaps, why we have two.
> > > >
> > > > Your email includes a couple of other implementations of which I am
> > > > ignorant, so I'm not prepared (yet) to comment on the extent to
> > > > which users of Apache XmlSchema could do just as well using those.
> > > >
> > > > --benson
> > > >
> > > >
> > > > On Sun, Apr 5, 2009 at 11:29 PM, Michael Glavassevich
> > > > <mrglavas@ca.ibm.com> wrote:
> > > >
> > > > Xerces has had its own XML Schema API [1][2] for years which has
> > > > many users. This API is implemented by Xerces-J, C++ and Perl,
> > > > providing a read-only view of the XML Schema component [3] model
> > > > which is primarily exposed through PSVI [4]. In Xerces-J the
> > > > implementation is optimized for schema validation. DOM is an
> > > > intermediate representation in building it but we throw it out at
> > > > the end. Keeping it around would bloat memory usage and doesn't fit
> > > > into the programming model which is schema component centric rather
> > > > than schema document centric.
> > > >
> > > > I was aware that WS-Commons decided to develop its own API though I
> > > > must say I never understood why given all the choices which already
> > > > existed [1][5][6][7]. I just assumed that the community had its
> > > > reasons, perhaps to support some specific scenarios which don't fit
> > > > well with the goals of these other APIs. I wouldn't have guessed
> > > > that it would have been due to ignorance of their existence. Surely
> > > > someone would have known about at least a few of these before
> > > > starting this WS-COMMONS project, right?
> > > >
> > > > Thanks.
> > > >
> > > > [1] http://www.w3.org/Submission/2004/SUBM-xmlschema-api-20040309/
> > > > [2] http://xerces.apache.org/xerces2-j/javadocs/xs/index.html
> > > > [3] http://www.w3.org/TR/xmlschema-1/#concepts-data-model
> > > > [4] http://xerces.apache.org/xerces2-j/javadocs/xs/org/apache/xerces/xs/ItemPSVI.html
> > > > [5] http://www.eclipse.org/modeling/mdt/?project=xsd#xsd
> > > > [6] http://xmlbeans.apache.org/docs/2.4.0/reference/index.html
> > > > [7] https://xsom.dev.java.net
> > > >
> > > > Michael Glavassevich
> > > > XML Parser Development
> > > > IBM Toronto Lab
> > > > E-mail: mrglavas@ca.ibm.com
> > > > E-mail: mrglavas@apache.org
> > > >
> > > > Benson Margulies <bimargulies@gmail.com> wrote on 04/05/2009 06:44:24 PM:
> > > > > Lawrence,
> > > > >
> > > > > Historically, that might have been the case or the intention, but
> > > > > there's nothing 'pull-y' about it today. If you feed it files, it
> > > > > builds DOM and walks DOM. If you feed it DOM, it just walks DOM.
> > > > >
> > > > > Now, I'm not an Axis2 developer, I'm a CXF developer, but from
> > > > > inside the code I just can't see how it would be a problem.
> > > > >
> > > > > -benson
> > > > >
> > > > > > On Sun, Apr 5, 2009 at 6:22 PM, Lawrence Mandel <lmandel@ca.ibm.com> wrote:
> > > > > Hi Benson,
> > > > >
> > > > > > I may not be remembering correctly, but I thought that one of the
> > > > > > reasons for developing XmlSchema in WS-COMMONS was to support a pull
> > > > > > based model (Axiom). I completely agree that there should not be
> > > > > > duplication of effort around these models at Apache. (Our collective
> > > > > > time is better served solving other problems.) Do you foresee any
> > > > > > issues (primarily with Axis2) with moving XmlSchema to a strictly
> > > > > > DOM based model? Is this question even relevant?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Lawrence
> > > > >
> > > > >
> > > > > > From: Benson Margulies <bimargulies@apache.org>
> > > > > > To: j-dev@xerces.apache.org
> > > > > > Cc: Daniel Kulp <dkulp@apache.org>, general@ws.apache.org
> > > > > > Date: 04/05/2009 05:45 PM
> > > > > > Subject: How many XML Schema libraries at ASF is too many XML Schema
> > > > > >   Libraries?
> > > > > > Sent by: bimargulies@gmail.com
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Dear Xerces-J developers,
> > > > >
> > > > > > At the moment, I'm the most active maintainer of Apache XmlSchema.
> > > > > > This library, which lives inside the WS-COMMONS project, which lives
> > > > > > inside the Web Services TLP, is a set of Java classes that form a
> > > > > > data model for W3C XML Schema, together with code to walk DOM of
> > > > > > schema documents and build the representation and vice versa.
> > > > >
> > > > > > XmlSchema has no capability to validate a document against a schema.
> > > > > > Typical applications, such as Apache CXF or Apache Axis, spend a
> > > > > > fair amount of time converting back and forth between XmlSchema
> > > > > > representation and the ordinary DOM for schema documents, if only to
> > > > > > pass them into the SchemaFactory to tee up validation, or into some
> > > > > > other library.
> > > > >
> > > > > > While I haven't gone spelunking in the code of Xerces yet, the
> > > > > > existence of the validation feature strikes me as strong
> > > > > > circumstantial evidence of the existence of some representation of
> > > > > > schema.
> > > > >
> > > > > > It strikes me that two Java class libraries for W3C XML Schema
> > > > > > inside ASF is, prima facie, one too many. So, I'm sending this email
> > > > > > to ask if the Xerces project has interest in working on exposing a
> > > > > > documented API to the XML Schema data model.
> > > > >
> > > > > > I am more or less on the verge of putting a significant pulse of
> > > > > > effort into modernization and performance enhancement of XmlSchema,
> > > > > > and if the same effort could yield a more broadly useful result, I'd
> > > > > > like to apply it there.
> > > > >
> > > > > > I am imagining a scheme where the core representation is DOM, using
> > > > > > subclasses of DOM interfaces to supply convenience methods for safe
> > > > > > and comprehensible access to the abstract data model. This could
> > > > > > radically speed up code that needs to handle schema documents as DOM
> > > > > > and also analyze and manipulate the abstract data model of the W3C
> > > > > > XML Schema. But, imagination aside, I think it would be good to
> > > > > > focus energy on a shared solution.
> > > > >
> > > > >
> > > > > Regards,
> > > > > Benson Margulies
>
> --
> Daniel Kulp
> dkulp@apache.org
> http://www.dankulp.com/blog
>
