xml-general mailing list archives

From Pierpaolo Fumagalli <p...@apache.org>
Subject Re: PDF to XML
Date Thu, 27 Jan 2000 15:39:21 GMT
Paul.Waugh@wdr.com wrote:
> Pier, apparently, according to their brochure, ReachCast has a product
> that can take PDF files and convert them to an XML document. I do not
> know how much flexibility you have in the conversion process.
> This conversion is of specific importance to me, as a project I'm
> about to work on has a number of legacy PDF docs that need to be
> converted into XML.

Yep, I've found it... But, as I said, they convert PDF to HTML or XML.
I believe (that's the only thing I can imagine) that when they convert
to XML, what they're really doing is taking the PDF and styling it into
an XHTML+CSS format.
That's the only thing logically possible, but, in that case, you lose
the power of XML, its ability to give a context to the content...
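
To make the point about context concrete, here is a minimal sketch in
Python. Both documents are invented for illustration: the first is a
guess at what a layout-preserving converter might emit (I have not seen
ReachCast's actual output), the second is a hypothetical hand-designed
vocabulary for the same content. The function names are mine as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of a layout-preserving converter: every piece of
# text is a styled, positioned box, with nothing saying what it means.
PRESENTATIONAL = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div style="position:absolute; top:40px; left:30px; font-weight:bold">Invoice 1042</div>
    <div style="position:absolute; top:80px; left:30px">Total: 99.50 USD</div>
  </body>
</html>"""

# Hypothetical semantic vocabulary carrying the same information.
SEMANTIC = """\
<invoice number="1042">
  <total currency="USD">99.50</total>
</invoice>"""


def element_names(xml_text):
    """Return the set of local element names used in a document."""
    root = ET.fromstring(xml_text)
    # Strip the "{namespace}" prefix ElementTree puts in front of tags.
    return {el.tag.split("}")[-1] for el in root.iter()}


def invoice_total(xml_text):
    """Look up the total by element name; None if no <total> exists."""
    node = ET.fromstring(xml_text).find(".//total")
    return None if node is None else float(node.text)


if __name__ == "__main__":
    print(sorted(element_names(PRESENTATIONAL)))
    print(sorted(element_names(SEMANTIC)))
    print(invoice_total(SEMANTIC))
    print(invoice_total(PRESENTATIONAL))
```

The styled version only ever tells you "this is a box at these
coordinates", so getting the total back out means scraping text. The
semantic version lets a program ask for the total by name.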

I've seen a similar tool used by Mike Pogue... He once showed me a
printer driver for Windows that wrote HTML to a file. I believe
you can use the same tool to print your PDFs to this "HTML printer" and
display them online. (Or, from HTML, convert them into XHTML, and try
to do something from there).

Mike, what was the tool you were using?


-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<mailto:pier@betaversion.org>    <http://www.betaversion.org/~pier/>
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------
