xml-general mailing list archives

From Joseph Okomba <Oko...@kabage.co.ke>
Subject RE: [vote] A native XML database project under Apache
Date Fri, 19 Oct 2001 04:27:05 GMT
+1 from me.

> -----Original Message-----
> From:	Stefano Mazzocchi [SMTP:stefano@apache.org]
> Sent:	18 October 2001 22:53
> To:	Apache XML; Kimbro Staken
> Subject:	[vote] A native XML database project under Apache
> 
> Hi,
> 
> While the world of native XML databases is full of marketing hype and
> promises, it is evident (to all those who have tried) that mapping general
> XML schemas to relational databases can sometimes be very painful and
> not very efficient.
> 
> In fact, it is widely recognized by the database research community
> that while well-structured data can be easily and efficiently mapped to a
> relational database, less structured (often called semi-structured) data
> is much more difficult to map.
> 
> Don't get me wrong: there are a number of ways to store XML in a database
> to add ACID properties to XML documents, but while this is a
> straightforward process for very repetitive and well-structured schemas
> (invoices, stock quotes, money transactions), it is not so for
> semi-structured schemas such as DocBook, SVG or even XSLT.
> 
> I hear you say: I use BLOBs and I'm fine with them. I'm sure you are,
> but in all honesty, I'm not. And for a few reasons:
> 
> 1) each documentation system requires a repository for documents. This is
> often called a "content management system". Since publishing is going
> toward replacing all content with an XML syntax (and we would all love to
> see that happen to its full extent), we must consider that such a system
> will require a persistent way to manage the content and a fast and
> efficient way to query it.
> 
> If you use BLOBs you lose an efficient way to look into the BLOBs
> themselves, so you are doomed before you even start.
> 
> You can fragment the XML document and map its semi-structured data to
> relational tables (and remember that documentation is almost always
> semi-structured!) but it can be shown that this is hard, very expensive
> and might require (depending on the document schema) a very high number
> of nested queries to translate even a very simple XPath expression.
> 
> Add complexities such as namespaces and the proposed XQL and you see
> that an XQL -> SQL translation might well be possible but is clearly going
> to become a nightmare to manage and very painful to optimize for efficiency.
> 
> The remaining option is to create a specific solution that leaves
> structured data to RDBMSs (where they really shine, no question about it)
> but moves semi-structured data over to a more specific and
> algorithmically optimized system.
> 
> Note that while ODBMS were supposed to solve the problem of
> semi-structured data, they, in fact, do not.
> 
> This is why we need a native XML DB solution with full support for
> namespaced content, XPath and XQL for querying, RDF for metadata.
> 
> 2) so, the content management system that everybody is crying out
> for requires a storage solution and I believe that a native XML DB is
> the way to go.
> 
> Also because:
> 
> 3) if we ever want to get deeper into the semantic web (and I,
> personally, do), we must forget well-structured data. Vocabularies
> such as RDF, RDFSchema, Topic Maps and the like are *not* going to be
> easily mapped into relational databases and efficiently searched.
> 
> So, this is why I propose the creation of a project hosted here under
> xml.apache.org to implement this effort.
> 
> Since it's generally very hard to bootstrap an open development
> community without some code to start working on, I suggest starting this
> project from the code that the dbXML guys are willing to donate to the
> ASF, in order to create a development community that can research and
> implement in this new field and, by doing so, hopefully lead the way in
> reducing the marketing crap and the hype around this.
> 
> FYI, dbXML (www.dbxml.org) is an implementation of a native XML database
> written in the Java language that is close to reaching its first final
> release.
> 
> I've been talking to one of the community leaders (copied here) who
> independently came to the same conclusion as I did and wanted to propose
> dbXML for donation even before I expressed my intentions.
> 
> Also, Sam Ruby has been subscribed to their development list, watching
> over them.
> 
> dbXML was created with the sponsorship of a commercial entity called "dbXML
> Group", which still exists but has no economic energy to continue its
> development, and the main developers are now working on the project
> unpaid.
> 
> But I'd like something to be clear: I'm *NOT* proposing that Apache
> take over 'dbXML group' to save dbXML and continue its development. I'm
> proposing that Apache create a new project for the creation of a
> production quality native XML database solution that implements existing
> and future standards (and hopefully has the power to influence their
> establishment) and that, in order to help bootstrap the community, we
> start with the current dbXML implementation, which is going to be donated
> to the ASF.
> 
> To show this and to avoid confusion with past releases and the "dbXML
> group" commercial entity, the project is *NOT* going to be called Apache
> dbXML, but rather something without acronyms, in the spirit of
> xml.apache.org.
> 
> Kimbro and I have been talking about "Apache BooBoo", but that is just
> the first name that crossed my mind :) If you have better names, please
> let us discuss this publicly if the deal gets approved.
> 
> Anyway, the dbXML folks are willing to donate the code and to change the
> name, as long as we give proper credit to "dbXML group" for having
> bootstrapped and donated the code (as we do for IBM, Lotus, Sun and
> others), and are more than willing to help with development, user
> support, research, community and evangelization. In fact, if the deal is
> accepted by this list, they are even willing to close down the site and
> move everything over here with the new name.
> 
> Let me finish by saying that I do not consider the actual code
> implementation important (a few, myself included, might not like some of
> their architectural choices, such as the use of CORBA and Jaggernaut).
> I'm *NOT* asking for a vote on their _actual_ technological status;
> I'm asking for a vote to create a community that can create, maintain
> and show the power of a native XML DB solution.
> 
> It might take years to have something solid enough to compete with big
> commercial names, but it is important, IMO, for Apache to have something
> to say even on this front by creating a community and attracting people
> and their ideas.
> 
> In fact, the dbXML guys are willing to donate the code, but also very
> happy about the possibility of a higher visibility that would bring more
> people and more ideas into the design process that is going to happen
> for their next major release.
> 
> So, people, I'm asking you to judge the idea of creating a community,
> rather than the current dbXML implementation, which is only a way to give
> users the meat they look for in that area and then attract them to
> new development and further research.
> 
> Sorry for the long mail.
> 
> Please, place your vote.
> 
> Thanks.
> 
> Stefano.
> 
> 
> 
> ---------------------------------------------------------------------
> In case of troubles, e-mail:     webmaster@xml.apache.org
> To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
> For additional commands, e-mail: general-help@xml.apache.org
