poi-user mailing list archives

From Ashaelon <scott.w.p...@gmail.com>
Subject Re: Read bookmarks in Office 2003 Word Document
Date Tue, 31 Jan 2012 19:08:56 GMT

Ashaelon wrote
> 
> Hi all -
> 
> I am very new to Java programming, so if I ask or refer to something
> wrong, or do not explain things well, please be gentle.  :)
> 
> I am using HWPF to try and read some bookmarks in an Office 2003 Word
> Document.  I have searched and searched for answers, but have been
> unsuccessful.  I have seen posts where people have said they have
> successfully read the bookmarks using Apache POI, but I haven't been able
> to decipher how they did it.  Here is the code that I am using:
> 
> try {
> 	FileInputStream fis = new FileInputStream(fileLoc);
> 	HWPFDocument wdDoc = new HWPFDocument(fis);
> 						
> 	try {
> 		DefaultTableModel tblModel = new DefaultTableModel();
> 		
> 		Bookmarks bkmkList = wdDoc.getBookmarks();
> 		int bkmkCount = bkmkList.getBookmarksCount();  //wdDoc.getBookmarks().getBookmarksCount();
> 		
> 		for (int i = 0; i < bkmkCount; i++) {
> 			Bookmark bkmk = bkmkList.getBookmark(i);
> 			List<String> bkmkInfo = new ArrayList<String>();
> 			Range bkmkRange = new Range(bkmk.getStart(), bkmk.getEnd(), wdDoc);
> 			bkmkInfo.add(bkmk.getName());
> 			bkmkInfo.add(bkmkRange.text());
> 								
> 			tblModel.addRow(bkmkInfo.toArray());  //this part is broken right now
> 			
> 			bkmkInfo.clear();
> 		}
> 						
> 		String[] aColNames = {"Bookmark Name", "Bookmark Data"};
> 		tblModel.setColumnIdentifiers(aColNames);
> 		
> 		tblList.setModel(tblModel);  //this part is broken right now
> 	} catch (Exception ex) {
> 		JOptionPane.showMessageDialog(null, ex.getStackTrace(), "Error",
> JOptionPane.ERROR_MESSAGE);
> 	} finally {
> 		if (fis != null) {
> 			try {
> 				fis.close();
> 				fis = null;
> 			} catch (Exception e) {
> 				// Nothing to recover here; just log the failure
> 				e.printStackTrace();
> 			} finally {
> 				wdDoc = null;
> 			}
> 		}
> 	}
> } catch (Exception e) {
> 	JOptionPane.showMessageDialog(null, e.getStackTrace(), "Error",
> JOptionPane.ERROR_MESSAGE);
> }
> 
> I am failing to see how to get only the text of the bookmark range.  The
> bookmarks in this document sit inside a table.  When I
> call:
> 
> bkmkRange.text()
> 
> I get the entire row of data returned instead of just the bookmark range. 
> I notice that getStart() and getEnd() return the range start and end of
> the entire column.
> 
> Any suggestions of how to fix this would be much appreciated.  I am at a
> complete loss.
> 
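
A side note on the two lines marked as broken in the quoted code.  My best
guess is that the rows are added while the DefaultTableModel still has zero
columns, so each row appears to get trimmed to zero cells, and setting the
column identifiers afterwards leaves the cells empty.  Below is a minimal
sketch of the other ordering, with the columns defined before any row is
added.  The class name, the buildModel helper and the sample values are made
up for illustration; only DefaultTableModel itself comes from my code.

```java
import javax.swing.table.DefaultTableModel;

public class BookmarkTableModelSketch {

    // Hypothetical helper: builds a two-column model from {name, text} pairs.
    // The column identifiers are passed to the constructor, so the model
    // already has two columns when the first row is added.
    public static DefaultTableModel buildModel(String[][] bookmarks) {
        DefaultTableModel model = new DefaultTableModel(
                new Object[] {"Bookmark Name", "Bookmark Data"}, 0);
        for (String[] bookmark : bookmarks) {
            // A fresh array per row, so nothing is shared between rows.
            model.addRow(new Object[] {bookmark[0], bookmark[1]});
        }
        return model;
    }

    public static void main(String[] args) {
        // Sample values are invented; in my code they would come from
        // bkmk.getName() and bkmkRange.text().
        DefaultTableModel model = buildModel(new String[][] {
                {"bkmkSample", "Bookmarked Text"}
        });
        System.out.println(model.getRowCount() + " row(s), "
                + model.getColumnCount() + " column(s)");
        System.out.println(model.getValueAt(0, 0) + " = " + model.getValueAt(0, 1));
    }
}
```

The resulting model could then be handed to the table with
tblList.setModel(...) as before.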


Just an update.  I ran a test reading the bookmarks with the code above.
I think I found what may be causing the issue, but I don't know how
to fix it (fixing the Word documents is not an option).  The problem seems
to be that the table cells are bookmarked, not the data in the cells.  For
some reason, this causes getStart() and getEnd() to return the range of
the table row, not the table cell.  I truly am at a complete loss now.  How
do I read the data out of the bookmarked table cell, or can I?
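
To make the symptom concrete, here is a small self-contained sketch of the
check I have been doing by hand: compare the bookmark's start and end against
the offsets of each cell in the row and list the cells that overlap.  It is
plain Java with no POI calls.  CellSpan and cellsCovered are hypothetical
names, and the offsets are invented; in HWPF I assume the real values would
come from getStartOffset() and getEndOffset() on each TableCell, and from
getStart() and getEnd() on the Bookmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BookmarkCellSketch {

    /** Hypothetical stand-in for a table cell: character offsets plus text. */
    public static final class CellSpan {
        final int start;   // inclusive
        final int end;     // exclusive
        final String text;

        public CellSpan(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    /** Returns the text of every cell that overlaps [bkmkStart, bkmkEnd). */
    public static List<String> cellsCovered(List<CellSpan> cells,
                                            int bkmkStart, int bkmkEnd) {
        List<String> covered = new ArrayList<String>();
        for (CellSpan cell : cells) {
            if (cell.start < bkmkEnd && cell.end > bkmkStart) {
                covered.add(cell.text);
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // One made-up row with two cells.
        List<CellSpan> row = Arrays.asList(
                new CellSpan(0, 6, "Label"),
                new CellSpan(6, 22, "Bookmarked Text"));

        // Bookmark placed on the text inside the second cell.
        System.out.println(cellsCovered(row, 6, 21));  // [Bookmarked Text]

        // Bookmark whose reported range is the whole row.
        System.out.println(cellsCovered(row, 0, 22));  // [Label, Bookmarked Text]
    }
}
```

The second call is the case I am stuck on: when the reported range is the
whole row, every cell overlaps it, so the offsets alone do not say which cell
was bookmarked.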

I have attached an image of my test document to show what I did.  The
"Bookmarked Text" part works properly because the text in the cell is
bookmarked rather than the cell itself.  The "Bookmarked Cells"
part does not work correctly.

Thanks in advance.

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/Read-bookmarks-in-Office-2003-Word-Document-tp5431394p5445484.html
Sent from the POI - User mailing list archive at Nabble.com.
