poi-user mailing list archives

From "Jiangpeng Shi" <Jiangpeng....@UTSouthwestern.edu>
Subject Re: A newbie question: how to get image position?
Date Thu, 10 Jun 2010 15:51:20 GMT
Hey Mark, 

Thank you very much for the help. I have learned a lot about POI from you over the past
couple of days. Thank you too, David. I checked the source code of EscherContainerRecord,
but I don't think I am smart enough to change that class to make things work. :-( 
I also checked the source code of HSSFWorkbook and traced the methods getAllPictures() and
searchForPictures(List escherRecords, List<HSSFPictureData> pictures). I can see that
all the picture data is retrieved by these two methods, and it looks like the image record is
saved in an EscherBlipRecord, as a child of an EscherBSERecord. Unfortunately, I couldn't get
any position information from these two classes. 
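
(A note for anyone following along: as far as I understand the format, the picture
bytes sit in a workbook-level store, which is where EscherBSERecord and EscherBlipRecord
come from, while the position sits in a per-sheet EscherClientAnchorRecord. So the two
would have to be joined through a picture index carried by the shape. The sketch below
shows only that join. ShapeInfo, picturesByRow and the 1-based index are assumptions made
for illustration; none of them are POI classes or confirmed POI behaviour.)

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in types; none of these are POI classes.
class PictureRowJoinSketch {

    // What one shape would need to supply: the top-left cell of its anchor
    // and the index of the picture it displays.
    static final class ShapeInfo {
        final int row1;
        final int col1;
        final int pictureIndex; // ASSUMPTION: 1-based index into the picture list

        ShapeInfo(int row1, int col1, int pictureIndex) {
            this.row1 = row1;
            this.col1 = col1;
            this.pictureIndex = pictureIndex;
        }
    }

    // Maps each anchor's top-left row to the picture it refers to.
    // Shapes whose index falls outside the list are skipped; if two shapes
    // share a row, the later one replaces the earlier one.
    static <P> Map<Integer, P> picturesByRow(List<ShapeInfo> shapes, List<P> pictures) {
        Map<Integer, P> byRow = new TreeMap<>();
        for (ShapeInfo shape : shapes) {
            int listIndex = shape.pictureIndex - 1; // convert 1-based to 0-based
            if (listIndex >= 0 && listIndex < pictures.size()) {
                byRow.put(shape.row1, pictures.get(listIndex));
            }
        }
        return byRow;
    }

    public static void main(String[] args) {
        // Strings stand in for the picture data objects.
        List<String> pictures = Arrays.asList("pic-A", "pic-B", "pic-C");
        List<ShapeInfo> shapes = Arrays.asList(
                new ShapeInfo(1, 0, 2),
                new ShapeInfo(2, 0, 1),
                new ShapeInfo(3, 0, 3));
        System.out.println(picturesByRow(shapes, pictures));
    }
}
```

With the three stand-in pictures and shapes in main, this prints {1=pic-B, 2=pic-A, 3=pic-C}.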

I will keep working on this issue, and I will certainly post here if I make any progress.
Thanks a lot for all your help. I really appreciate it. 

--Jerry



>>> MSB <markbrdsly@tiscali.co.uk> 6/10/2010 10:28 AM >>>

Hello Jerry,

David is quite correct to point us in the direction of the fillFields()
method. I have managed to replicate the problem myself by creating a
worksheet with around fifty images; processing that workbook produced
the same error message, or at least a similar one, as the number of bytes
remaining was different.

The check is really a sensible one to make as it prevents the code from
throwing an IndexOutOfBoundsException or something similar. Whilst the
solution is fairly obvious - we need to get that remaining data - the bad
news is that I will not be able to devote much time to the problem over the
next few days and do not have any confidence in my ability to work through
the API's code to identify a solution in any case. So, you may want to raise
this as a bug so that it comes to the attention of more capable people.
Having said that, I will continue to trace through the sequence of
method calls to see if anything obvious presents itself.
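
To illustrate the kind of check being discussed, here is a minimal sketch of a
parser that stops when a container header promises more bytes than the buffer
holds, and reports the shortfall instead of throwing. The record layout (a
4-byte length followed by length-prefixed children) and all of the names are
invented for the example; this is not POI's fillFields() code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a container parser; not POI's implementation.
class BoundedContainerSketch {

    // Outcome of one parse: the child payload sizes that were read, plus the
    // number of bytes the header promised that were not in the buffer.
    static final class Result {
        final List<Integer> childSizes = new ArrayList<>();
        int bytesMissing;
    }

    // Layout assumed for illustration only: a 4-byte little-endian length for
    // the container, then children that are each a 4-byte length followed by
    // that many payload bytes.
    static Result parse(byte[] data) {
        if (data.length < 4) {
            throw new IllegalArgumentException("No container header present");
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        Result result = new Result();
        int bytesRemaining = buf.getInt(); // what the header claims follows
        while (bytesRemaining > 0 && buf.remaining() >= 4) {
            int childSize = buf.getInt();
            if (childSize < 0 || childSize > buf.remaining()) {
                break; // this child would run past the end of the buffer
            }
            buf.position(buf.position() + childSize); // skip the child payload
            result.childSizes.add(childSize);
            bytesRemaining -= 4 + childSize;
        }
        if (bytesRemaining > 0) {
            // Report the shortfall rather than reading past the buffer.
            result.bytesMissing = bytesRemaining;
            System.out.println("WARNING: " + bytesRemaining
                    + " bytes remaining but no space left");
        }
        return result;
    }

    public static void main(String[] args) {
        // The header claims 20 bytes, but only one 6-byte child (4 + 2) follows.
        ByteBuffer buf = ByteBuffer.allocate(10).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(20).putInt(2).put((byte) 1).put((byte) 2);
        Result r = parse(buf.array());
        System.out.println(r.childSizes + ", missing " + r.bytesMissing);
    }
}
```

In this toy case the one child that fits is still read and the missing 14 bytes
are only reported, which matches the symptom in the thread: some records are
processed and the rest are silently left out.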

Yours

Mark B


jerry-112 wrote:
> 
> Mark, thank you so much for your help. I think your code works very well:
> we can get detailed information about the class names and the images.
> One thing I couldn't understand is that it looks like POI only reads
> part of my spreadsheet. I didn't notice before that, on the console
> output, I get a warning like this every time: 
> 
> ...
> 
> WARNING: 15343 bytes remaining but no space left
> WARNING: 15343 bytes remaining but no space left
> .....
> 
> I googled around and didn't find any clue about this. Here is the
> output of the code: 
> 
> WARNING: 15343 bytes remaining but no space left
> WARNING: 15343 bytes remaining but no space left
> org.apache.poi.ddf.EscherDgRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpgrRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 1 at the offset position x 177 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 1 at the offset position x 842 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 2 at the offset position x 333 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 2 at the offset position x 687 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 3 at the offset position x 233 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 3 at the offset position x 786 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 4 at the offset position x 190 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 4 at the offset position x 825 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 5 at the offset position x 190 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 5 at the offset position x 821 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> ..........
> 
> 
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 50 at the offset position x 294 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 50 at the offset position x 726 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherContainerRecord
> org.apache.poi.ddf.EscherSpRecord
> org.apache.poi.ddf.EscherOptRecord
> org.apache.poi.ddf.EscherClientAnchorRecord
> The top left hand corner of the image can be found in the cell at column
> number 0 and row number 51 at the offset position x 302 and y 11
> co-ordinates.
> The bottom right hand corner of the image can be found in the cell at
> column number 0 and row number 51 at the offset position x 721 and y 221
> co-ordinates.
> org.apache.poi.ddf.EscherClientDataRecord
> org.apache.poi.ddf.EscherTextboxRecord
> 
> 
> It lists 52 images for the first 52 rows and is then truncated. About
> 10 images are simply left out. Could that be related to the
> warning? 
> 
> I will look into the org.apache.poi.hssf.util.CellReference class too,
> more carefully. Thank you very much for giving me a direction to look
> in. Before this I was just shooting in the dark....
> 
> Thanks again. 
> 
> --Jerry
> 
> 
>>>> MSB <markbrdsly@tiscali.co.uk> 6/9/2010 10:23 AM >>>
> 
> We are toying with the Escher layer, which is a bit of a beast, so I
> think we need to do some digging to find out exactly what you are
> working with.
> 
> The first thing you might try is letting the code tell you which
> records it finds as it parses the nested series of records. All you
> need to do is add a line to the iterateRecords() method so that the
> main loop now looks like this:
> 
> while(recordIter.hasNext()) {
>    childRecord = recordIter.next();
>    // Print the class of every record encountered while descending.
>    System.out.println(childRecord.getClass().getName());
>    if(childRecord instanceof EscherClientAnchorRecord) {
>       this.printAnchorDetails((EscherClientAnchorRecord)childRecord);
>    }
>    if(childRecord.getChildRecords().size() > 0) {
>       // Pass level + 1 so that sibling records share the same depth value.
>       this.iterateRecords(childRecord, level + 1);
>    }
> }
> 
> That at least will tell us what we are dealing with, and it may well become
> apparent that there are other types of anchor record associated with those
> images that we also need to test for. So, if you see another type of anchor
> record listed, it could well be worth amending the if statement so that it
> checks for either EscherClientAnchorRecords or whatever other type
> of anchor record you find - assuming there is one, of course. 
> 
> If that does not work, the best course of action would be to upload the
> workbook somewhere so that I can get my hands on it and dig around a little.
> This assumes that you can let me have the file, and you must check carefully
> with your manager or client before doing so.
> 
> Yours
> 
> Mark B
> 
> PS I did not include this in the first iteration of the code but it is
> something you may like to look into yourself. Have a look at the
> org.apache.poi.hssf.util.CellReference class; it makes it a trivial task to
> convert between POI's number based and Excel's letter/number based cell
> references. That could avoid some confusion when you are dealing with lots
> of images and trying to convert from one indexing scheme to another in your
> head. 
> 
> 
> jerry-112 wrote:
>> 
>> 
>> Mark, thank you very much for the help. I tried this code and it works
>> pretty well. I can get a list of column and row numbers. But then I noticed
>> that it couldn't retrieve all the images.
>> 
>> I tried to get all the images from a single-sheet xls file with the
>> following code: 
>> 
>> 		InputStream myxls = new FileInputStream(filename);
>> 		HSSFWorkbook wb     = new HSSFWorkbook(myxls);
>> 		List list = wb.getAllPictures();
>> 
>> I get a list of 65 pictures in total, but when I tried to get their
>> position data with the code you provided, it only returned 55
>> positions. I double-checked the spreadsheet and confirmed that there are
>> 65 pictures in that sheet. I guess something is being missed. I might be
>> confused by those children, etc. Any suggestion is appreciated. Thanks
>> again for the help. 
>> 
>> --Jerry 
>> 
>> 
>> 
>> 
>>>>> MSB <markbrdsly@tiscali.co.uk> 6/8/2010 9:33 AM >>>
>> 
>> I should have known that would be too easy!
>> 
>> This morning, I managed to write the code to recover the anchor information
>> for images inserted into one of the older, binary, Excel workbooks;
>> /*
>>  * To change this template, choose Tools | Templates
>>  * and open the template in the editor.
>>  */
>> 
>> package imagematrices;
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.IOException;
>> import java.io.FileNotFoundException;
>> import java.io.FilenameFilter;
>> import java.util.Iterator;
>> import java.util.ArrayList;
>> import java.util.List;
>> 
>> import org.apache.poi.hssf.record.EscherAggregate;
>> import org.apache.poi.ddf.EscherRecord;
>> import org.apache.poi.ddf.EscherClientAnchorRecord;
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.hssf.usermodel.HSSFWorkbook;
>> import org.apache.poi.hssf.usermodel.HSSFSheet;
>> import org.apache.poi.xssf.usermodel.XSSFWorkbook;
>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>> 
>> /**
>>  *
>>  * @author win user
>>  */
>> public class Main {
>> 
>>     private ArrayList<File> excelFiles = null;
>> 
>>     public void getImageMatrices(String folderName)
>>             throws IOException, FileNotFoundException, InvalidFormatException {
>>         File fileFolder = new File(folderName);
>>         File[] excelWorkbooks = fileFolder.listFiles(new ExcelFilenameFilter());
>>         for(File excelWorkbook : excelWorkbooks) {
>>             Workbook workbook = WorkbookFactory.create(
>>                     new FileInputStream(excelWorkbook));
>>             if(workbook instanceof HSSFWorkbook) {
>>                 this.processImages((HSSFWorkbook)workbook);
>>             }
>>             else {
>>                 this.processImages((XSSFWorkbook)workbook);
>>             }
>>         }
>>     }
>> 
>>     private void processImages(HSSFWorkbook workbook) {
>>         EscherAggregate drawingAggregate = null;
>>         HSSFSheet sheet = null;
>>         List<EscherRecord> recordList = null;
>>         Iterator<EscherRecord> recordIter = null;
>>         int numSheets = workbook.getNumberOfSheets();
>>         for(int i = 0; i < numSheets; i++) {
>>             System.out.println("Processing sheet number: " + (i + 1));
>>             sheet = workbook.getSheetAt(i);
>>             drawingAggregate = sheet.getDrawingEscherAggregate();
>>             if(drawingAggregate != null) {
>>                 recordList = drawingAggregate.getEscherRecords();
>>                 recordIter = recordList.iterator();
>>                 while(recordIter.hasNext()) {
>>                     this.iterateRecords(recordIter.next(), 1);
>>                 }
>>             }
>>         }
>>     }
>> 
>>     private void iterateRecords(EscherRecord escherRecord, int level) {
>>         List<EscherRecord> recordList = null;
>>         Iterator<EscherRecord> recordIter = null;
>>         EscherRecord childRecord = null;
>>         recordList = escherRecord.getChildRecords();
>>         recordIter = recordList.iterator();
>>         while(recordIter.hasNext()) {
>>             childRecord = recordIter.next();
>>             if(childRecord instanceof EscherClientAnchorRecord) {
>>                 this.printAnchorDetails((EscherClientAnchorRecord)childRecord);
>>             }
>>             if(childRecord.getChildRecords().size() > 0) {
>>                 // Pass level + 1 so that sibling records share the same depth value.
>>                 this.iterateRecords(childRecord, level + 1);
>>             }
>>         }
>>     }
>> 
>>     private void printAnchorDetails(EscherClientAnchorRecord anchorRecord) {
>>         System.out.println("The top left hand corner of the image can be found " +
>>                 "in the cell at column number " +
>>                 anchorRecord.getCol1() +
>>                 " and row number " +
>>                 anchorRecord.getRow1() +
>>                 " at the offset position x " +
>>                 anchorRecord.getDx1() +
>>                 " and y " +
>>                 anchorRecord.getDy1() +
>>                 " co-ordinates.");
>>         System.out.println("The bottom right hand corner of the image can be found " +
>>                 "in the cell at column number " +
>>                 anchorRecord.getCol2() +
>>                 " and row number " +
>>                 anchorRecord.getRow2() +
>>                 " at the offset position x " +
>>                 anchorRecord.getDx2() +
>>                 " and y " +
>>                 anchorRecord.getDy2() +
>>                 " co-ordinates.");
>>     }
>> 
>>     private void processImages(XSSFWorkbook workbook) {
>>         System.out.println("No support yet for OOXML based workbooks. Investigating.");
>>     }
>> 
>>     /**
>>      * @param args the command line arguments
>>      */
>>     public static void main(String[] args) {
>>         try {
>>             new Main().getImageMatrices("C:/temp/Excel");
>>         }
>>         catch(Exception ex) {
>>             System.out.println("Caught an: " + ex.getClass().getName());
>>             System.out.println("Message: " + ex.getMessage());
>>             System.out.println("Stacktrace follows:.....");
>>             ex.printStackTrace(System.out);
>>         }
>>     }
>> 
>>     public class ExcelFilenameFilter implements FilenameFilter {
>> 
>>         public boolean accept(File file, String fileName) {
>>             boolean includeFile = false;
>>             if(fileName.endsWith(".xls") || fileName.endsWith(".xlsx")) {
>>                 includeFile = true;
>>             }
>>             return(includeFile);
>>         }
>>     }
>> }
>> 
>> As you can see, I have had to dig into the bowels of the POI record
>> structure to get at the image's location. This same tactic will not work for
>> the OOXML based workbooks and I am still looking into how to recover that
>> information. I expect it to be much easier and to have something to do with
>> relations, but I cannot be sure yet. Will post again if I find anything out.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> jerry-112 wrote:
>>> 
>>> Hey guys, 
>>> 
>>> I am a newcomer to the POI framework. In one of my projects, I need to read
>>> images from a .xls file. Each row has a column that contains an image,
>>> and I need to read it out. It looks like I can read all the images together,
>>> but how can I get each image's position, such as its column number and row
>>> number, so that I can relate those images to other data? Any suggestion is
>>> highly appreciated. Thanks. 
>>> 
>>> --Jiangpeng Shi
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org 
>>> For additional commands, e-mail: user-help@poi.apache.org 
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/A-newbie-question%3A-how-to-get-image-position--tp28813811p28818753.html

>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/A-newbie-question%3A-how-to-get-image-position--tp28813811p28831912.html

> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/A-newbie-question%3A-how-to-get-image-position--tp28813811p28844679.html

Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org 
For additional commands, e-mail: user-help@poi.apache.org 

