incubator-odf-users mailing list archives

From Ram Kane <ramdk...@gmail.com>
Subject Re: Is there a way to extract text on a page basis from odt ?
Date Mon, 26 Sep 2011 13:56:21 GMT
Thanks all for the replies.


> It seems best to revisit the problem statement and extract a
> grounded case: What is the problem that needs to be solved;
> what are the constraints on an acceptable solution.
>
> Ram, can you please say more about the problem you want to solve?
> What would be the simplest-acceptable result?


I need to extract the content of a given page inside a doc. By content I
mean the header, footer, footnotes, comments, and main text from the body.
I need the option of extracting each of these elements of the page
separately (header for page X, footer for page X, body text for page X),
not just getting all the content as a single string.

I've uploaded a doc that I found in your svn to use as an example here
-> http://goo.gl/OMIEw

Using the example doc, and assuming that I need to extract content for
page 1, I'd need to extract:

    _ header ("ODFDOM in a header")
    _ footer ("ODFDOM in a footer")
    _ footnotes for page ("ODFDOM in a footnote")
    _ main text and all additional content in the page body ("ODFDOM
      in a title ODFDOM in a section header ODFDOM in paragraph1 ...")
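
To make the requested result shape concrete, here is a minimal sketch in Python (standard library only, not ODFDOM) that pulls the four kinds of content out of an .odt package as separate lists. It assumes the usual ODF package layout: headers and footers live in styles.xml under style:master-page, while body text, footnotes, and comments live in content.xml. The function name extract_parts and the returned dictionary keys are made up for illustration. Note the limitation: this separates the parts for the whole document, not per page, because headers and footers are attached to page styles and the page on which a paragraph ends up is decided by the layout engine rather than stored as page numbers in the file.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
P = "{%s}p" % NS["text"]
H = "{%s}h" % NS["text"]
NOTE = "{%s}note" % NS["text"]
NOTE_CLASS = "{%s}note-class" % NS["text"]
ANNOTATION = "{%s}annotation" % NS["office"]


def _flatten(elem, skip_tags=()):
    """Concatenate all text under elem, leaving out subtrees in skip_tags."""
    pieces = [elem.text or ""]
    for child in elem:
        if child.tag not in skip_tags:
            pieces.append(_flatten(child, skip_tags))
        pieces.append(child.tail or "")  # the tail belongs to the parent
    return "".join(pieces)


def _norm(s):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(s.split())


def _blocks(elem):
    """Yield paragraph and heading text in document order, skipping notes."""
    for child in elem:
        if child.tag in (P, H):
            yield _norm(_flatten(child, skip_tags=(NOTE, ANNOTATION)))
        elif child.tag not in (NOTE, ANNOTATION):
            yield from _blocks(child)


def extract_parts(odt):
    """Hypothetical helper: odt is a path or a binary file-like object."""
    with zipfile.ZipFile(odt) as z:
        styles = ET.fromstring(z.read("styles.xml"))
        content = ET.fromstring(z.read("content.xml"))

    def master(tag):
        path = ".//style:master-page/style:" + tag
        return [_norm(_flatten(e)) for e in styles.iterfind(path, NS)]

    footnotes = []
    for note in content.iter(NOTE):
        body = note.find("text:note-body", NS)
        if note.get(NOTE_CLASS) == "footnote" and body is not None:
            footnotes.append(_norm(_flatten(body)))

    # Only the text:p children of a comment, not its author/date metadata.
    comments = [
        _norm(" ".join(_flatten(p) for p in a.iterfind("text:p", NS)))
        for a in content.iter(ANNOTATION)
    ]

    office_text = content.find(".//office:text", NS)
    body = [] if office_text is None else [t for t in _blocks(office_text) if t]
    return {
        "headers": master("header"),
        "footers": master("footer"),
        "footnotes": footnotes,
        "comments": comments,
        "body": body,
    }
```

Calling extract_parts("example.odt") (the file name is a placeholder) returns a dictionary with one list per part, so the header, footer, footnotes, comments, and body text can each be read on their own. Special whitespace elements such as text:s and text:tab are not expanded in this sketch.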
