incubator-odf-users mailing list archives

From Devin Han <devin...@apache.org>
Subject Re: Is there a way to extract text on a page basis from odt ?
Date Wed, 28 Sep 2011 01:38:36 GMT
2011/9/26 Ram Kane <ramdkane@gmail.com>

> I've tried that. The problem is that it works at the document level.
>
> I need to be able to extract content for a given page.
>

Does it make sense to extract content by paragraph?
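
If paragraph granularity is acceptable, something along these lines might do as a starting point. It is only a sketch under a few assumptions: it bypasses the toolkit and reads content.xml straight out of the .odt package with plain JDK classes, and OdtParagraphs, fromContentXml and fromOdt are made-up names for illustration, not toolkit API. It collects <text:p> elements only, so headings (<text:h>) are skipped, tabs and line breaks inside a paragraph are dropped, and a paragraph nested in a text box is returned both on its own and as part of its parent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of the ODF Toolkit): paragraph-level text
// extraction using only the JDK.
class OdtParagraphs {
    // ODF "text" namespace, which <text:p> belongs to.
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Returns the text of every <text:p> element, in document order.
    static List<String> fromContentXml(InputStream contentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the namespace lookup below
        Document doc = factory.newDocumentBuilder().parse(contentXml);
        NodeList nodes = doc.getElementsByTagNameNS(TEXT_NS, "p");
        List<String> paragraphs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() concatenates all descendant text nodes,
            // so text inside <text:span> is included.
            paragraphs.add(nodes.item(i).getTextContent());
        }
        return paragraphs;
    }

    // An .odt file is a zip package; the document body lives in content.xml.
    static List<String> fromOdt(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IOException("content.xml not found in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return fromContentXml(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java OdtParagraphs <file.odt>");
            return;
        }
        List<String> paragraphs = fromOdt(args[0]);
        for (int i = 0; i < paragraphs.size(); i++) {
            System.out.println(i + ": " + paragraphs.get(i));
        }
    }
}
```

Running it against a file prints one numbered line per paragraph. Note that this gives paragraph boundaries only; it says nothing about which page a paragraph ends up on.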


>
> Thx a lot for the code though.
>
>
> On Mon, Sep 26, 2011 at 2:46 AM, Devin Han <devinhan@apache.org> wrote:
> > Hi Ram,
> >
> > I suppose you only want to extract the text (header, footer, comments,
> > endnotes, etc.) and don't care about page breaks.
> > Please see the sample code.
> >
> >       TextDocument textdoc = (TextDocument) TextDocument.loadDocument("textExtractor.odt");
> >       EditableTextExtractor extractorD = EditableTextExtractor.newOdfEditableTextExtractor(textdoc);
> >       String output = extractorD.getText();
> >       System.out.println(output);
> >
> > This code fragment returns all of the content except the header and
> > footer. For the header and footer content, see the following:
> >       Header header = textdoc.getHeader();
> >       output = TextExtractor.getText(header.getOdfElement());
> >       System.out.println(output);
> >
> >       Footer footer = textdoc.getFooter();
> >       output = TextExtractor.getText(footer.getOdfElement());
> >       System.out.println(output);
> >
>



-- 
-Devin
