poi-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Levi Strope" <lstr...@besttransport.com>
Subject RE: There has got to be a way
Date Wed, 02 May 2007 14:47:05 GMT
Forgive me for not clarifying earlier, this is using HSSF.

To answer your question Dave, Yes - I have taken that into
consideration.  I'm not saying my code is correct in looking for the
blank cells with the BlankRecord listener, but according to my limited
knowledge and the javadoc a blankrecord is any record without data that
contains styling information.  That's why I'm asking for help - I would
love it if the answer is just to use a different type of listener, but I
need someone to tell me what that would be.

So here is what I did last night:
1) I went through each and every blank cell in the row and applied a
default styling.  After running my code again it returned 1 less blank
record than it did before!  I would have expected it to return more.
2) This got me to thinking that I was completely backwards, so I went in
and removed formatting from the blank cells and ran my test again.  This
time it returned the same number of blank records, only it reported
these blank records on different columns!  

I've tried adding a MulBlank record listener - and that doesn't detect
any of my cells where there might be 3 blanks in a row.  I'm not sure
what the criteria is for this but according to the javadoc I believe it
should return at least 2 mulblank records with the excel file I
provided.  According the the javadoc it says that all MulBlank records
are converted into individual blankrecords.  I'm starting to believe
there may be something wrong with this conversion.  The reason why I
believe this is in the 2 tests that I noted above, the inconsistencies
in the columns that were reported as blanks occurred where there might
be more than 1 blank column side by side. It's almost as if HSSF sees
these columns as the same column and is reporting them as 1 blank record
when it should be more than one.

-Levi  

-----Original Message-----
From: David Henry [mailto:dave.henry@fedex.com] 
Sent: Wednesday, May 02, 2007 10:18 AM
To: POI Users List
Subject: RE: There has got to be a way

Levi,
 
I haven't dug through all your code and spreadsheet, but have you
considered what the definition of "blank" is?  In many similar
circumstances I have found that relying upon isBlank, for example, may
not be sufficient if you have a mix of true blank (i.e. <null> value)
cells and cells with zero length data (i.e. looks "blank" but, by
Excel's reckoning contains a value).  Sometimes something as simple as
tabbing to the cell may be sufficient to change a "blank" to a
zero-length value.  You may have already taken this into consideration,
but thought I'd mention it as a first check.
 
- David

________________________________

From: Levi Strope [mailto:lstrope@besttransport.com]
Sent: Tuesday, May 01, 2007 10:11 PM
To: POI Users List
Subject: There has got to be a way


Please look at the attached excel spreadsheet.  I would really
appreciate some input.
 
All I need to do is read in a row of data, just like the spreadsheet
that is attached, and parse the information reliably.  There are many
blank cells here, but I need to account for them.  
 
My business problem is this:  We are accepting messages where columns of
information are not required, but the column MUST be represented in the
file as we process it.  At minimum I'd like to be able to read in these
excel files and output them in a CSV format.  Essentially as I am
streaming the excel file out, I am appending comma's between the values
because our messaging system expects the values to be comma delimited.
Unfortunately we accpet XLS files so we cannot just ask them to save it
as a CSV.  This can happen in 1 of 2 ways, I can stream the file and
append the commas between the cells as I am sending it to our messaging
system, or I can stream it out to a file and build an actual CSV, and
then tell the messaging system to pick it up.
 
When I read in an excel file I am finding it very difficult to determine
when cells are blank, and account for them.  For some reason the
BlankRecordListener doesn't pickup all blank records.  I've even tried
applying formatting to all cells and running my code, it still does not
work.
 
I put code in the blank record listener code to tell me the column.
Here is what I get:
 
Blank record encounterred.  Column= 3
Blank record encounterred.  Column= 4
Blank record encounterred.  Column= 7
Blank record encounterred.  Column= 8
Blank record encounterred.  Column= 10
Blank record encounterred.  Column= 18
Blank record encounterred.  Column= 20
Blank record encounterred.  Column= 30
Blank record encounterred.  Column= 34

There is no conditional formatting in this code, all I'm doing is
listening for a blank record and doing a System.out. If you look at the
attached spreadsheet you can see that columns 21, 22, 24, 32, and 33 are
blank as well.  There are 14 blank records in this spreadsheet but the
blankrecord listener only reports 9. 
 
I need help.  I need someone to tell me what kind of record listener I
need to use.  How can I account for these cells?
 
Here is my code.  At present it skips the blankcells when writing it out
to a CSV.   
 
/*
 *
 * Created on April 5, 2007, 11:49 AM
 *
 *
 *
 * @author lstrope
 */
 
package poitest;
 
import java.io.*;
 
import org.apache.poi.poifs.filesystem.*;
import org.apache.poi.hssf.eventusermodel.*;
import org.apache.poi.hssf.record.*;
 
public class PoiXLStest implements HSSFListener {
    private SSTRecord sstrec;
    int rowRecLen[] = new int[RowRecord.MAX_ROW_NUMBER];
    PrintWriter CSV = null;
   // int rowNum = 0;
    int colNum = 0;
 
    public PoiXLStest()
    {
            this("c:\\ExceltoCSV.csv");
    }
    
    public PoiXLStest(String F)
    {
        try{
            CSV = new PrintWriter(new BufferedWriter((new
FileWriter(F))));
        }
        catch(FileNotFoundException E){}
        catch(IOException E){}
    }
    
    public PoiXLStest(InputStream in)
    {   
        PoiXLStest noargs = new PoiXLStest();
        HSSFRequest req = new HSSFRequest();
//        req.addListener(noargs, SSTRecord.sid);
//        req.addListener(noargs, LabelSSTRecord.sid);
//        req.addListener(noargs, RowRecord.sid);
//        req.addListener(noargs, NumberRecord.sid);
//        req.addListener(noargs, BlankRecord.sid);
        req.addListenerForAllRecords(noargs);
        HSSFEventFactory factory = new HSSFEventFactory();
       
        try{
            factory.processEvents(req, in);
        }
        catch(IOException E){
            System.out.println("Problem in constructor");
        }
    }
    
    public void processRecord(Record record)  
    {
        short sidVal;
        sidVal = record.getSid();
 
        if(sidVal == RowRecord.sid){
            RowRecord rowRec = (RowRecord) record;
            if(rowRec.getRecordSize() > 0){
                rowRecLen[rowRec.getRowNumber()] = rowRec.getLastCol();
// Setting array to hold the row at it's physical position with its
Column length.
                //System.out.println(rowRec.getLastCol());
            }

        }
 
        if(sidVal == SSTRecord.sid){
            sstrec = (SSTRecord) record;
        }
 
        if(sidVal == LabelSSTRecord.sid){
            LabelSSTRecord lrec = (LabelSSTRecord) record;
                if(lrec.getColumn() < (rowRecLen[lrec.getRow()] - 1) &&
!(lrec.getColumn() < colNum)){
                    //System.out.print(lrec.getColumn());
                    CSV.print(sstrec.getString(lrec.getSSTIndex()) +
",");
                    colNum++;
                }
                else if(lrec.getColumn() < colNum){
                    //System.out.print(lrec.getColumn());
                    CSV.print("\n" +
sstrec.getString(lrec.getSSTIndex()) + ",");
                    colNum = 1;  
                }
                else{
                    //System.out.print(lrec.getColumn());
                    CSV.print(sstrec.getString(lrec.getSSTIndex()) +
",\n");
                    colNum = 0;
                } 
            
        }
 

        if(sidVal == NumberRecord.sid){
            NumberRecord nrec = (NumberRecord) record;
                if(nrec.getColumn() < (rowRecLen[nrec.getRow()] - 1) &&
!(nrec.getColumn() < colNum)){
                    //System.out.print(nrec.getColumn());
                    CSV.print(nrec.getValue() + ",");
                    colNum++;
                }
                else if(nrec.getColumn() < colNum){
                    //System.out.print(nrec.getColumn());
                    CSV.print("\n" + nrec.getValue() + ",");     
                    colNum = 1;
                    }
                    else{
                        //System.out.print(nrec.getColumn());
                        CSV.print(nrec.getValue() + ",\n");
                        colNum = 0;
                    }
            
        } 
 

        if(sidVal == BlankRecord.sid){
            BlankRecord brec = (BlankRecord) record;
            System.out.println("Blank record encounterred.  Column= " +
(brec.getColumn()+ 1));
        }
        
        
       CSV.flush();
    }
    
     public static void main(String[] args) throws IOException,
FileNotFoundException
     {  
        POIFSFileSystem wbook = new POIFSFileSystem(new
FileInputStream("c:\\evensmallertest.xls"));
        InputStream docIn = new
BufferedInputStream(wbook.createDocumentInputStream("Workbook"));  //if
the file has a 'read only recommendation' this will fail.
        PoiXLStest start = new PoiXLStest(docIn);
        docIn.close();
        System.out.println("\n**********  Finished Processing File
***********");
        
     }
    
   
}  
 
 
 
 
 
 

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/


Mime
View raw message