poi-user mailing list archives

From Suba Suresh <su...@wolfram.com>
Subject Re: PowerPoint extractor
Date Tue, 27 Jun 2006 21:34:25 GMT
Thank you for all the pointers. They are a great help. I used today's
build, and it worked fine for Word documents. I have not tried the
metadata yet. For PowerPoint, the PowerPoint extractor prints the
following for just one file. Am I doing anything wrong? I didn't change
my code.

No core record found with ID 3 based on PersistPtr lookup
No core record found with ID 10 based on PersistPtr lookup
No core record found with ID 12 based on PersistPtr lookup
No core record found with ID 13 based on PersistPtr lookup
No core record found with ID 16 based on PersistPtr lookup
......
No core record found with ID 246 based on PersistPtr lookup

import java.io.FileInputStream;
import org.apache.poi.hslf.extractor.PowerPointExtractor;

PowerPointExtractor ppExtractor = new PowerPointExtractor(
        new FileInputStream("filename.ppt"));
String text = ppExtractor.getText();

Also, since some of the Excel files were not in 97-2002 format, I used
POIFSFileSystem, read the content as a byte stream, and stored it as a
text string. I hope that is fine.
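
A quick way to see up front whether a file is an OLE2 container at all
(the only kind POIFSFileSystem can open) is to compare its first eight
bytes with the OLE2 signature. The sketch below uses only the Java
standard library. The class name Ole2Check and its method names are
made up for illustration and are not part of POI. Note that this only
identifies the container: older Excel 5/95 workbooks are also stored in
OLE2 files, so a match does not by itself mean the file is in the
97-2002 format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of POI: checks for the OLE2 file signature.
class Ole2Check {
    // The eight signature bytes at the start of an OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given bytes begin with the OLE2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first eight bytes of the named file and checks them.
    static boolean fileLooksLikeOle2(String fileName) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(Paths.get(fileName))) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == header.length && looksLikeOle2(header);
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": "
                    + (fileLooksLikeOle2(name) ? "OLE2 container" : "not OLE2"));
        }
    }
}
```

Files that fail this check (for example plain text or CSV saved with an
.xls extension) would need to be read some other way.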

thanks,
suba suresh.

Nick Burch wrote:
> On Mon, 26 Jun 2006, Suba Suresh wrote:
> 
>>I can go to the link and download the file to bugzilla. Is there any
>>procedure I have to follow? What is the link to bugzilla?
> 
> 
> Just follow the "Bug Database" link from the sidebar when at
> http://jakarta.apache.org/poi/. That said, I've updated the slide building
> code today, so your problem might now be fixed. Try a new SVN build, and
> report back :)
> 
> 
>>As an aside, I am trying to do the same with a Word document file with
>>the poi hdf library. I just want to extract text. How can I do it?
> 
> 
> You'll be better off with hwpf. See another post to the list today for a
> guide.
> 
> 
>>and also how can I extract meta data from all the microsoft format
>>files.
> 
> 
> For that, you'll want hpsf:
> 	http://jakarta.apache.org/poi/hpsf/index.html
> 
> Nick
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
> Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
> The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/



