xml-general mailing list archives

From Neil Graham <ne...@ca.ibm.com>
Subject Re: [VOTE]: motion to transform Xerces into a top-level project as a member of the "federation" of XML projects
Date Mon, 10 May 2004 20:11:43 GMT




Hi Andy,


>>>> The idea of subprojects being simultaneously projects with their
>>>> own subprojects looks like a terminal terminological disaster to
>>>> me.  :)
>
>>> That's because you keep referring to Xerces-J as a sub-project.
>
>> That's because that's what it would technically be if something just
>> called "Xerces" were a TLP (top-level project).
>
> That's where we disagree, I guess.

Indeed it is.  And I'm glad we agree on that:  the first part of resolving
a debate is to characterize precisely what's at issue!

> I consider all the parser
> implementations as "Xerces", the TLP. They are built, packaged,
> and shipped separately but I still consider them all the same
> thing at that level. With that view, Xerces-* are all "Xerces".

And I think it's imperative that we create a charter that recognizes what I
take to be a crucial fact: Xerces-J and Xerces-C are very different code
bases, with different architectures and committer communities, and have to
some extent been optimized for differing uses.  We need a charter that
formalizes our long-standing practice that a committer on one project is not
automatically considered a committer on the others.

Not that I expect to have any problems here any time soon; but surely part
of the reason for writing a charter is to codify existing practices that
seem to have some value, and this would appear to be one well worth
codifying.

> Then the sub-projects are collective units below that (e.g.
> HTML). Does that make more sense?

I have trouble accepting the idea that the WML DOM implementation that got
dumped on Xerces-J all those years ago has much to do with Xerces-P...

> It certainly is more complicated because we have implementations
> in different languages. I'm not disagreeing with that. But I
> don't want the organization to be based on programming language;
> I'd rather it be based on function. For example, XML parsing is
> the TLP, and things like HTML parsing are built on it and are
> sub-projects of it.

And so are XSLT processing and XML Security work...  I'm not necessarily
opposed to bringing something like an HTML parser into the Xerces fold, but
it does seem you could argue it both ways.

> So we could break
> down the way the parser is packaged and consider the things
> that we "break out" as sub-projects.

Personally, I have no strong feeling on how we should subdivide the things
that Xerces-J now distributes en masse.  And whether we call them products
of Xerces-J, subsubprojects of Xerces-J, or put them in a sandbox or
Xerces-commons area composed of componentry closely allied to XML parsing,
it's all good.  The only point I'll have to stick to is that there isn't
some magical Xerces entity that contains all the components of Xerces-*
that are considered "core" to parsing.

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  neilg@ca.ibm.com




                                                                                         
                                             
Andy Clark <andy@cyberneko.net>
05/06/2004 02:07 PM
Please respond to general

To:       xerces-j-dev@xml.apache.org
cc:       pmc@xml.apache.org, general@xml.apache.org, xerces-c-dev@xml.apache.org
Subject:  Re: [VOTE]:  motion to transform Xerces into a top-level project as a member of the "federation" of XML projects



Neil Graham wrote:
>>> The idea of subprojects being simultaneously projects with their
>>> own subprojects looks like a terminal terminological disaster to
>>> me.  :)
>
>> That's because you keep referring to Xerces-J as a sub-project.
>
> That's because that's what it would technically be if something just
> called "Xerces" were a TLP (top-level project).

That's where we disagree, I guess. I consider all the parser
implementations as "Xerces", the TLP. They are built, packaged,
and shipped separately but I still consider them all the same
thing at that level. With that view, Xerces-* are all "Xerces".
Then the sub-projects are collective units below that (e.g.
HTML). Does that make more sense?

It certainly is more complicated because we have implementations
in different languages. I'm not disagreeing with that. But I
don't want the organization to be based on programming language;
I'd rather it be based on function. For example, XML parsing is
the TLP, and things like HTML parsing are built on it and are
sub-projects of it.

>> Perhaps it would be easier to start by defining what we want to
>> ship and then work backwards from there.
>
> Sounds good.  So far, we only ship archives containing jars, DLLs
> and source code that directly relate to XML parsing with standard
> interfaces. What additionally would you like to be able to throw into
> the mix?

Each parser implementation ships a binary and source package.
Just looking at Xerces-J as an example, people have always
complained about the size of the download. So we could break
down the way the parser is packaged and consider the things
that we "break out" as sub-projects.

Currently, the Xerces-J release contains the following:

   * parser (XNI, scanner, configurations)
   * validation (DTD, XML Schema, datatype library)
   * standard APIs (DOM, SAX)
   * HTML (DOM)
   * WML (DOM)
   * serializers
   * utility classes

Did I miss anything?

I'm not suggesting that each one of these be made a separate
package -- I just want a complete list to start talking about
what the current package contains.

The things I see in the future include things like:

   * HTML (scanner, configuration)
   * validation (RelaxNG)
   * additional APIs (pull-parsing)
   * data-binding

From these lists I think we can decide what pieces are part
of the TLP and what pieces are not. The things that are not
part of the TLP are candidates for being made part of a
sub-project. For example, HTML is a good candidate: it would
include the HTML DOM implementation as well as an HTML parser
built on the Xerces framework.

--
Andy Clark * andyc@apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org




