cocoon-users mailing list archives

From Mark Reitman
Subject Re: Component approach to web development using Cocoon
Date Wed, 05 Jan 2000 18:11:32 GMT
Berin Loritsch wrote:

> XML: Specifies content and meta data.  This is the perfect area for business
>          analysts or columnists (if it's a news site).  If your site ever incorporates
>          a search function, you will be able to return more accurate results if
>          you use a standard schema for your site.  While you can use XHTML,
>          it won't help you with the search functions.  The DocBook schema,
>          although officially it is SGML, is a very good basis to use for a site.

At Dow Jones we created a system that converted news articles into
XML for searching, templating and other reasons.  To search them, we
had to convert the documents stored as XML back to HTML, because search
engines (at least the ones I've used) don't interpret XML or DTDs and
rely heavily on meta data.
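
A minimal sketch of that kind of XML-to-HTML step, for illustration
only.  The element names (article, headline, meta, body, p) and the
sample article are assumptions made up for this example, not the
actual Dow Jones schema or code:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article format; the real schema is not given in this thread.
SAMPLE_ARTICLE = """<article>
  <headline>Markets rally</headline>
  <meta name="keywords">stocks, markets</meta>
  <meta name="section">business</meta>
  <body>
    <p>Shares rose on Wednesday.</p>
    <p>Analysts expect more gains.</p>
  </body>
</article>"""


def article_to_html(xml_text: str) -> str:
    """Flatten an XML article into plain HTML that an HTML-only indexer can read."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("headline", default=""))
    # Each <meta name="..."> element becomes an HTML <meta> tag in <head>,
    # since the search engine relies on meta data rather than XML structure.
    meta_tags = [
        '<meta name="{}" content="{}">'.format(
            escape(m.get("name", ""), quote=True),
            escape((m.text or "").strip(), quote=True),
        )
        for m in root.findall("meta")
    ]
    paragraphs = [
        "<p>{}</p>".format(escape((p.text or "").strip()))
        for p in root.findall("body/p")
    ]
    return "\n".join(
        ["<html>", "<head>", "<title>{}</title>".format(title)]
        + meta_tags
        + ["</head>", "<body>", "<h1>{}</h1>".format(title)]
        + paragraphs
        + ["</body>", "</html>"]
    )


if __name__ == "__main__":
    print(article_to_html(SAMPLE_ARTICLE))
```

The XML stays the master copy; the HTML is a throwaway rendering
generated only so the indexer has something it understands.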

Meta data requires human intervention, and it's a little tricky since
one editor might interpret the same article differently from another,
at least in terms of what characterizes it.

Still, XML (SGML) is the way to go for news sites. 

What we need is an editor that converts documents to XML, uses fuzzy
logic and meta-tagging patterns (historical statistics) to enrich the
documents, and converts the tags to data definitions, along with a
search engine that can read DTDs.


