Return-Path: Delivered-To: apmail-forrest-dev-archive@www.apache.org Received: (qmail 48933 invoked from network); 3 Oct 2004 09:08:32 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199) by minotaur-2.apache.org with SMTP; 3 Oct 2004 09:08:32 -0000 Received: (qmail 79278 invoked by uid 500); 3 Oct 2004 09:08:31 -0000 Delivered-To: apmail-forrest-dev-archive@forrest.apache.org Received: (qmail 79232 invoked by uid 500); 3 Oct 2004 09:08:30 -0000 Mailing-List: contact dev-help@forrest.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: list-post: Reply-To: dev@forrest.apache.org Delivered-To: mailing list dev@forrest.apache.org Received: (qmail 79220 invoked by uid 99); 3 Oct 2004 09:08:30 -0000 X-ASF-Spam-Status: No, hits=0.7 required=10.0 tests=DNS_FROM_RFC_WHOIS,FORGED_RCVD_HELO,INITIAL_INVEST,SPF_HELO_PASS,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: neutral (hermes.apache.org: local policy) Received: from [196.25.240.73] (HELO ctb-mesg1.saix.net) (196.25.240.73) by apache.org (qpsmtpd/0.28) with ESMTP; Sun, 03 Oct 2004 02:08:28 -0700 Received: from sean.site (ndn-67-228.telkomadsl.co.za [165.165.67.228]) by ctb-mesg1.saix.net (Postfix) with ESMTP id 767B15409; Sun, 3 Oct 2004 11:08:21 +0200 (SAST) From: Sean Wheller Reply-To: sean@inwords.co.za Organization: In Words To: dev@forrest.apache.org Subject: Re: map:transform to docbook xsl Date: Sun, 3 Oct 2004 11:06:24 +0200 User-Agent: KMail/1.6.2 Cc: David Crossley References: <200410021632.07084.sean@enbaya.co.za> <1096767098.19659.970.camel@ighp> In-Reply-To: <1096767098.19659.970.camel@ighp> MIME-Version: 1.0 Content-Disposition: inline Content-Type: Multipart/Mixed; boundary="Boundary-00=_QE8XBjpwDOXk9JY" Message-Id: <200410031106.24893.sean@inwords.co.za> X-Virus-Checked: Checked X-Spam-Rating: minotaur-2.apache.org 1.6.2 0/1000/N --Boundary-00=_QE8XBjpwDOXk9JY Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Sunday 03 October 2004 03:31, David Crossley wrote: > > Patience with me. I am trying to get forrest to use the nwalsh docbook > > stylesheets directly as per > > http://localhost:8888/faq.html#docbook > > And then following the general notes for adding > project sitemaps in the "Using Forrest" document > and the "Project Sitemaps" document. > > I presume that you are using the Forrest-0.6-dev > SVN trunk. Yes. I have followed these instructions over and over and yes I use SVN trunk. Just to make sure. I ripped forrest out, then: 1. did svn checkout of trunk 2. ./build.sh (succeed) env was already set. 3. create new dir and forrest seed (works) 4. execute forrest run from root of new seed project to text if I can see the proj, I can. Stop Jetty. 5. Add to site.xml 7. cp db-article.xml to seed project xdocs/ 8. forrest run. I see changes to menu. 9. Click "XML Solves Problems" db-article.xml is transformed to document (default docbooktodocument.xsl). Stop jetty. 10. Write new sitemap.xmap saved to documentation/sitemap.xmap No need to register the docbook dtd since forrest ships with it and it is already configured in forrest.xmap. I understand the the sitemap.xmap in see documentation/ will take precedence over forrest.xmap 11. Forrest run and try to see the db-article. NO LUCK:-( Just a blank body. As explained in previous message. Thanks for your help. If you have any other questions, let me know. I am considering using Forrest for a new Open Source project and really want it to work as all sources are in docbook xml. I get this right, I will update the forrest docs and send a patch. Attached is the db-article.xml for you to test. -- Sean Wheller Technical Author 084-8549408 sean@inwords.co.za http://www.inwords.co.za --Boundary-00=_QE8XBjpwDOXk9JY Content-Type: text/xml; charset="utf-8"; name="db-article.xml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="db-article.xml"
XML Solves Tech Publishing Problems Testing 123 XML Solves Tech Publishing Problems Testing 123 Sean Wheller sean@inwords.co.za 2004 2004 In Words. All Rights Reserved. Permission to use, copy, modify and distribute this document for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice, this paragraph and acknowledgment of the original author appears in all copies. If you are a technical writer or manager of a documentation group you have probably encountered or been faced with solving problems like: single-sourcing, collaborative authoring, cross-platform editing, multi-channel publishing, improving information quality and consistency, enhancing functionality of electronic output, negating technology lock-in, and even reducing costs without reducing team head count. This article explores how the use of XML technologies within your authoring system can help you achieve each of these objectives. Introduction Technology changes the face of the future by solving past problems and opening avenues for new possibilities. So what happens when technology comes along that changes virtually every other technology? That's what we are about to find out as we explore the impact Extensible Markup Language (XML) is having on the tech publishing industry. XML is a recommendation from the World Wide Web Consortium (W3C). Like HTML (Hyper Text Markup Language), XML is a markup language. Both have their roots in Standard Generalized Markup Language (SGML), but were designed with different goals in mind: XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks. XML is therefore not a replacement for HTML. HTML is purely about displaying information, while XML is solely about describing it. Both contain data, but XML separates the data from the presentation layer. What does this mean to tech publishing departments? How does it benefit and change work processes for the better? From WYSIWYG to WYSIOO To understand the answers we must understand the problems tech publishing departments have with current technologies. However, before we do this we must understand a fundamental shift between the way authors currently interact with text using traditional technologies and how they will do so with the XML based technologies of the future. A key difference between current technologies and XML based systems is the shift away from a WYSIWYG editing environment to WYSIOO. WYSIWYG stands for "What You See Is What You Get." WYSIOO stands for "What You See Is One Option." The difference between these editing environments is founded in the different goals for which HTML and XML were designed. Traditional publishing and help authoring tools provide a WYSIWYG environment for authors to compose their texts. They display pages of information on screen using layout and formatting that is a near exact rendition of how the page will look in the final product. In addition to enabling text composition, they also enable formatting and layout. Much the same as the HTML design goal. This approach is not possible with XML since it was designed to describe data and to focus on what data is, not how data will be displayed or formatted. Authors composing texts stored in XML must use a 'Structured Editor.' This means the editor is focused on two tasks: text composition and valid markup thereof. Since formatting information cannot be saved inside the XML file, the best an editor can do is render temporary formatting and layout for the duration of the editing session. What authors see during the editing session is "One Option" of what they may get in the final product, WYSIOO. WYSIOO and WYSIWYG are opposite ends of the same stick. The two end-points cannot be brought together without breaking the stick. Try as you may, if you do manage to bring these polarities together you will find that you have violated the fundamental laws of one or both of these paradigms. Storing information in a format that is presentation neutral and editing in an environment that does not perpetually mirror the layout and formatting of the final product is strange and disconcerting to people accustomed to working with WYSIWYG. The reason is that they have to relinquish control over the presentation layer. While this should be good news, as it relieves authors from having to pay attention to formatting and layout, it is often most disconcerting to people who are accustomed to using fonts, adding bold, italics and underscore to text. They are also very accustomed to viewing page layout and content flow in the comfort of knowing that, what they see on screen is how it will look in the output. Not being able to see life in this manner unnerves some authors to the extent that they find it hard to concentrate. As trivial as it may seem, some never get used to WYSIOO and prefer to remain with their current Word Processor and Help Authoring Tool. However, separation of data and presentation layers is the key that enables tech publications departments to overcome problems with current technologies and employ new techniques that are better suited to the requirements of today and the future. Solving the Problems So what are the problems current tools cannot solve? What techniques do they not support and so fail in meeting objectives? Broadly speaking there are seven areas where tech publications departments have problems that are hard to solve using traditional technologies, they include: The problems found in each of these areas hinder performance, quality, and cost improvements. Solutions to these problems, using traditional technologies, have been at best partial. They are generally cumbersome and require a high degree of user intervention. These solutions tended to solve a particular problem of one area, instead of addressing all areas in a manner that is cost effective, efficient and responsive. Single-sourcing The need to publish content to multiple platforms, user agents and user types has, more often than not, meant that a single document will be maintained for each target. This approach is fraught with problems, especially when it comes to maintaining a copy for print and online help outputs. There has long been a latent need for technology that could maintain documents in a format suitable for conversion to both targets without incurring a large maintenance overhead. This gave rise to a concept called "single-sourcing." Long the "Holy Grail" of tech publishing, the idea is to write once and publish to many target formats from a since source. The source format never being used for presentation, only for editing. Tools such as RoboHelp, ForeHelp, and HDK all purported to be single-source tools that would enable a person to write once and publish to many. This was not really true, since each of these tools used the online help source format, a presentation format, as the master for authoring and therefore did a pretty poor job of converting to Word document or PDF. With XML the approach is different. Documents are authored in XML which, as we explained, is presentation neutral. The XML is then transformed to a target presentation format using Extensible Stylesheet Language (XSL), another XML format, and an XSL processor; an application that takes XML and XSL as inputs and outputs a new file. This file may be another XML file, a presentational format such as HTML, or XSL-FO a transitional format that can be transform to binary presentation files such as RTF, PDF, or PostScript. This approach is far more effective since the transformation can include any number of custom parameters and functions that automate the process. The transformation from XML to any target format requires far fewer steps and requires minimal user intervention. Collaborative authoring In every authoring process there comes a point where input is required from other people. The problem with traditional formats is that they do not easily enable two or more people to work on the same document simultaneously. In times of pressure or short deadlines this can be a real problem. What is needed is a way to allow people to work on the document in tandem, without clobbering each others work or corrupting the file. XML alone does not solve this problem. However, since XML is a plain text format, as apposed to a binary format typically used by traditional tools, it is light weight, easy to transport and perfect for management under revision management systems such as Concurrent Version System (CVS) or Subversion (SVN). This means that all authors can obtain a copy or version of a document, simultaneously edit it and commit only their changes back to the revision system. Everyone works against a centralized, revision controlled repository. Updating their working copies and committing their changes back for everyone else to see. Note that we said, "only their changes" as apposed to saving the entire file back to the repository which is what you literally do when using binary formats even when working under revision control. In practice this method of managing documents is akin to the method used by multiple programmers who need to collaborate on the same code. In addition to the ability to work on the same document under revision management, managing XML documents in revision management systems such as CVS or SVN enables authors in any geographic location to use the Internet to collaborate on a project wherever they are. Cross-platform editing There are very few tools that enable technical authors to use any operating system or editor to open and edit all of their documents. In most cases, one is forced to use the operating system and a copy of the tool that was originally used to create the document. This is most inconvenient as writers are often required to document software that runs only on a Solaris, Linux, or OS/2 system, when their workstation has a variant of Windows installed and a Windows based Word Processor such as Word. To get around the problem an author has to compose on one computer and experiment on another. This is less than efficient since it means that additional computers are required for each writer, or a single computer must be 'time-shared.' It also slows down the authoring process as the author is frequently forced to move between machines. It is far more efficient to provide an editor that can run on all platforms. One solution may be Open Office , but this would mean installing a large application for what is essentially a simple task. While this is a nice to have situation, with XML it is not mandatory that the application first used to create a document be installed on each computer where the document will be edited. This has the nice side-effect of enabling authors to choose to use any combination of their favorite tools without impacting on, or creating a situation where other authors must use the same tools. Just as XML is presentation-neutral, it is also platform-neutral. Meaning that it can be opened and edited on any OS using any XML or plain text editor. So you can create the document on Windows using a structured editor and then use a plain text editor on any other platform, many of which have tools such as VI or emacs pre-installed. The tools provide modules for structured editing. This last point is hugely advantageous in environments where people need to share documents with colleagues or partners located on the other side of the globe and have no influence on the tools they may use or their organization provides. Multi-channel publishing The concept of multi-channel publishing extends the concept of single-sourcing to include the ability to maintain a single source containing content for all platform, or customer, or user specific differences and nuances. In other words, several types of document can be derived from the same source. For example: your product runs on Linux and Windows. Each version provides OS specific options. You want to capture screens with the look and feel of the OS and document these options. Since the options are OS specific it would be redundant to put them in both books. Using XML it is possible to use a technique called 'profiling.' OS specific screens and options are specially tagged so that they can be included or excluded from the output. Profiling makes it possible for technical authors to author in a single place, making optimum use of common texts and profiling those texts that are OS, customer, or user specific. This saves time, reduces content duplication and delivers information into the output that is relevant to the version and environment the user is working with. Improving information quality and consistency Improving quality is a responsibility shared by the entire organization. An XML system, employing the techniques described, goes a long way to enabling the whole organization to collaborate on and contribute to quality. The portability of XML documents enables other departments to open and edit using any text editor as and when they spot a problem. These changes are tracked by the revision management system where they can be spotted by the authoring group and edited for consistency. The combined and distributed effort of many will mean that documents are kept current with less effort. Authors spend less time doing errata and can focus on adding new sections. With an XML based system the improvement in quality and consistency is also achieved in other ways. Authors have no control over formatting while writing. This is done at the end of the writing process by the stylesheets. They determine all aspects of layout and formatting. Since it is not possible to obtain a formated output without transforming the XML document using XSL, locating stylesheets in a central location and limiting there modification to a few people, ensures that consistency of layout and formatting can be governed across the organization. Another technique made easily possible with 'modular documentation.' The structured format of XML makes it possible to break large chunks of text into smaller parts that can be assembled into multiple larger documents. This significantly increases 'content reuse.' The idea is to write a small chunk of information into a single file and include the text at the appropriate location by reference. The information is not duplicated but stored once and dynamically inserted into the main document during transformation. Increase in content reuse has, over time, an exponential impact on reducing the time required to update information. An update made to a single file is inherited by many larger documents and automatically reflected within the next transformation. Once again, this concept is akin to the way programmers write Object Orientated (OO) code. Already existing objects are reused many times. Modular documentation and content reuse techniques serve to shave thousands of dollars off development costs when implemented correctly. Enhancing functionality of electronic output The engines or viewers used to open and render traditional electronic formats have been limited to a specific set of functionality. This made it difficult for developers to add features and functionality to the document. For example: Let's assume you have a need for users to be able to add comments in line to the paragraphs on a help topic. This would not be possible with traditional help authoring solutions. With XML transformed to HTML, combined with JavaScript and running under a Web Browser this is now possible. The ability to achieve this objective is made possible, once again, because the transformation scenario can include any number of parameters or functions. By programming the transformation to include the specific scripts and integrate them with the transformed presentation output the required functionality is achieved. This step need only be done once and reused countless times by everyone in the organization with the exact same results in formatting and functionality. Negating vendor lock-in This objective is concerned with ensuring that the authoring system is free from proprietary solutions that may become obsolete or tie the organization into a specific vendor product, which has traditionally been the case. Departments based on Microsoft Word or RoboHelp have found that these tools have locked them in to their use for the foreseeable future and excluded them from solving the problems already mentioned. With XML the risk of technology lock-in by proprietary tools is negated since any text editor can open and edit an XML file. The W3C standards ensure this interoperability. An example that is soon to play itself out is the new online help format that Microsoft will ship with its' new product. Code named "Longhorn.", the help engine is based on XML and is a proprietary extension to the Open Source format known as Docbook. The problem is that publications departments using traditional tools will have to convert all their existing documents to the Longhorn format. No doubt vendors will provide upgrade products that port to Longhorn, but the department will still have a cost in licensing and training. These costs would be avoided in they had been working in XML since authors would be focused on content and not formatting or layout. A new set of stylesheets would take care of Longhorn requirements on their behalf. This example demonstrates how using an XML based authoring system can prevent vendor lock-in and ensure compatibility with future technologies and the requirements they bring. Reducing costs Cost reduction is a major initiative of most organizations today. Everybody wants to do more with less. Cost reduction often results in team reduction. This places additional pressure on remaining team members. By employing XML and the practices outlined in this document, publication departments can realize significant cost reductions; without losing team members or sacrificing on improvements in quality and performance. However, before making the change one must first evaluate the technology to determine whether or not XML is a suitable solution. Do you have a need for the objectives we have discussed? If not, then XML is very likely not a solution for you. The cost incurred in moving to such a system could not be justified. However, just because you could benefit from a few of these benefits also does not justify a move without a solid business. In addition, it is wise to keep in mind that the decision to implement such a system is a strategic one. Return On Investment (ROI), is not immediate and the Total Cost of Ownership (TOC), is not always lower that the current system in every instance. Now building a business case is not an easy task. Most people would be seriously challenged if asked to build a business case. Fortunately an example business case is available at IDEAlliance. The document, entitled Developing a Business Case for an XML Authoring System was written by Jean Mercedes Hamilton , a Senior IT Project Manager at SPX Valley Forge, Munich, Germany. It provides a template on which anyone can build a business case that will knock the managers socks off and curl their toes into a fist. Where Next? There are many who would recommend that the best way to implement an XML authoring system is to write your own XML Document Type Definition (DTD) and XSL stylesheets. Of course this is the best option since you can custom design the system to meet your exacting requirements, but this approach requires time and money; both commodities are normally in short supply. To avoid this problem myself and others recommend adopting an already established, mature and Open Source application already mentioned in this document, Docbook. Docbook is a standard managed by OASIS. Docbook defines a markup vocabulary that is particularly well suited to describing technical literature. Managed as an Open Source Project at SourceForge, Docbook is the standard for storage of documentation on Open Source projects, but has also been widely adopted by commercial projects. This standard is by far the most popular starting point for organizations that want to get started with the implementation of an XML authoring system. It provides a comprehensive DTD that covers practically every tag you will ever need to describe technical documents. The project also provides a robust set of XSL stylesheets that enable transformation to formats such as Rich Text Format (RTF), Portable Document Format (PDF), HTML/XHTML, HTML Help, JavaHelp and PostScript. Docbook has a large Open Source following. The community not only develops the application but also supports it via its' mailing list. In addition there are two books written on Docbook that can help technical authors get started and productive. Cost and time savings achieved by using Docbook are immense when compared with the cost and time for rolling your own. Conclusion Changes in technology and the market environment have brought about new requirements that traditional tools and technologies cannot satisfy. An authoring system based on XML technologies enables organizations to meet these requirements while increasing their level of execution and reducing development costs. However, XML authoring systems are not applicable to every situation. Organizations must first evaluate the technology to determine whether or not the initial investment in such a system will add value. Providing a positive business case can be made, there is no need to build a custom system from the ground up. Open Source technologies based on the Docbook standard provide a cost effective way to implement a system and realize a ROI in a shorter period. The decision to move an authoring system to XML is a strategic one. While any such initiative should be founded on a solid business case, numbers alone are not the only reason for implementation. Factors such as technology lock-in and future compatibility can help organizations to avoid future surprises and unexpected costs. About the author Sean Wheller is an information systems architect and technical author. He works as an independent contractor. During his career, spanning 15-years, leading multi-national and numerous startup technology vendors have contracted his services. He works on Linux and has been using XML technologies and Docbook XML for the past three years. When he is not writing for customers, he contributes to Open Source projects. He lives in Johannesburg, South Africa with his wife, three children, two cats, and five penguins. He can be reach at sean.at.inwords.co.za.
--Boundary-00=_QE8XBjpwDOXk9JY--