poi-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Fisher <dfis...@jmlafferty.com>
Subject Re: Problem reading XLS files
Date Tue, 26 Jan 2010 16:14:30 GMT
Hi -

You can get blank cells anywhere in an Excel file. This can happen if  
the Excel user visits some cell way to the right of the table. There  
is no guarantee that you won't find blank cells in the middle.

Is it possible for you to have the number of columns be an input? Or,  
follow the Mark's suggestion let the header determine the number of  
columns and then make sure you always provide that many and only that  
many for each row.

Regards,
Dave

On Jan 26, 2010, at 5:29 AM, Aju Mathai wrote:

> Hi ,
>
> I am using a pipe character as a column delimiter . The problem is
> with blank cells in the excel file .
>
> Thanks
>
> On Tue, Jan 26, 2010 at 6:48 PM, Fermin Da Costa Gomez
> <dacostagomez@dcgconsultancy.nl> wrote:
>> Hi,
>>
>> Why don't you have a look at the code i posted a couple of days ago  
>> re. a
>> XLS2CSV export.
>> Instead of a , you could used the pipe character.
>> It also works on a per row basis and i have not seen any strange  
>> things yet
>> re. the header. This is no guarantee off course .. ;-)
>>
>> Just a thought.
>>
>> Gl
>>
>> Fermin Dcg
>>
>> On Tue, Jan 26, 2010 at 1:59 PM, Aju Mathai <mathaiaju@gmail.com>  
>> wrote:
>>
>>> Hi Mark,
>>>
>>> I guess we cannot make the assumption regarding the header cells  
>>> also.
>>> These are files generated by a third party vendor , so cannot
>>> guarantee this.
>>>
>>> Thanks & Regards,
>>> mathaj
>>>
>>> On Tue, Jan 26, 2010 at 6:24 PM, MSB <markbrdsly@tiscali.co.uk>  
>>> wrote:
>>>>
>>>> This will be a very quick answer as I have only just called into  
>>>> the
>>> office
>>>> before going on to another site.
>>>>
>>>> Can you safely assume that there will be no blank columns in the  
>>>> header
>>> row?
>>>> By this, I mean will all of the columns have headings? If so,  
>>>> then you
>>> know
>>>> in advance the higest possible column number (from the right most  
>>>> column)
>>>> and if you find a blank cell that has a column number that is  
>>>> higher than
>>>> this then you have reached the end of the row and have  
>>>> encountered one of
>>>> these additional cells.
>>>>
>>>> Will give the problem some more thought whilst I am working this
>>> afternoon
>>>> and if anything else pops into my mind, I will post. But I think  
>>>> it is
>>> safe
>>>> to say that this is the type of solution you will need to look for.
>>>>
>>>> Yours
>>>>
>>>> Mark B
>>>>
>>>>
>>>> Aju Mathai wrote:
>>>>>
>>>>> Hi Mark,
>>>>>
>>>>> What i am trying to accomplish is read a row of data from the  
>>>>> excel
>>>>> file and delimit the columns with a pipe "|" character. This row  
>>>>> of
>>>>> data is passed on for processing . My problem is i need to find  
>>>>> the
>>>>> last column , if say i have 2 columns in my excel file my output  
>>>>> will
>>>>> be
>>>>>
>>>>> ColumnHeader1|ColumnHeader2|
>>>>> TextColumn1|TextColumn2|
>>>>>
>>>>> But suppose for the 3rd column in my excel sheet also returns me  
>>>>> true
>>>>> (output of the celliterator.hasNext()) .My code below will give  
>>>>> me a
>>>>> output
>>>>>
>>>>> ColumnHeader1|ColumnHeader2|
>>>>> TextColumn1|TextColumn2||
>>>>>
>>>>> An extra pipe character at the end. How should i handle this  
>>>>> scenario
>>>>> , i am interested in all cell types string,numeric,boolean etc
>>>>>
>>>>>
>>>>> Thanks & Regards,
>>>>> mathaj
>>>>> On Mon, Jan 25, 2010 at 10:21 PM, MSB <markbrdsly@tiscali.co.uk>
 
>>>>> wrote:
>>>>>>
>>>>>> You would need to use an additional tool - BiffViewer - to  
>>>>>> accomplish
>>>>>> thais
>>>>>> and it may well not be wirth while in this case.
>>>>>>
>>>>>> Can I ask, do the extra columns matter to you? Cannot you  
>>>>>> simply ignore
>>>>>> the
>>>>>> additional blank cells that appear? It looks from the loop that 

>>>>>> you
>>> have
>>>>>> posted that the only cell tyep you are interested in are the  
>>>>>> String(s).
>>>>>>
>>>>>> Yours
>>>>>>
>>>>>> Mark B
>>>>>>
>>>>>>
>>>>>> Aju Mathai wrote:
>>>>>>>
>>>>>>>  Hi Mark,
>>>>>>>>
>>>>>>>> The file is generated automatically without any user  
>>>>>>>> intervention.
>>> How
>>>>>>>> do i find out whether the excel file has been modified in
the  
>>>>>>>> ways
>>> you
>>>>>>>> talk about ? . The POI version that i am using is 3.5_beta3.
>>>>>>>>
>>>>>>>> Thanks & Regards,
>>>>>>>> mathaj
>>>>>>>
>>>>>>> On Fri, Jan 22, 2010 at 1:12 PM, MSB <markbrdsly@tiscali.co.uk>
>>> wrote:
>>>>>>>>
>>>>>>>> Excel can behave quite oddly at times and it will create
a  
>>>>>>>> record for
>>> a
>>>>>>>> cell
>>>>>>>> if it has been 'touched' in any way by the user during  
>>>>>>>> creation of
>>> the
>>>>>>>> file.
>>>>>>>> By this, I mean that if the user entered a value into a cell
 
>>>>>>>> and then
>>>>>>>> cleared that out or if they set a format and then removied
 
>>>>>>>> it, Excel
>>>>>>>> will
>>>>>>>> recognise that something was done to the cell and will create
a
>>> record
>>>>>>>> for
>>>>>>>> it in the file. Without seeing the actual file/files you
are  
>>>>>>>> working
>>>>>>>> with
>>>>>>>> I
>>>>>>>> cannot comment much further but do suspect that this may
have  
>>>>>>>> been
>>> the
>>>>>>>> case;
>>>>>>>> either the user or the application that created the file
left  
>>>>>>>> a few
>>> of
>>>>>>>> these
>>>>>>>> artifacts around. I do not expect that this is a problem
with  
>>>>>>>> POI but
>>>>>>>> to
>>>>>>>> be
>>>>>>>> certain, can you post one of the files that is exhibiting
 
>>>>>>>> this sort
>>> of
>>>>>>>> behaviour so that we could take a closer look at it please?
 
>>>>>>>> Further,
>>>>>>>> can
>>>>>>>> you
>>>>>>>> say which version of the API you are using and how the Excel
 
>>>>>>>> file is
>>>>>>>> generated - by someone using Excel or by another application
-
>>> please?
>>>>>>>>
>>>>>>>> Yours
>>>>>>>>
>>>>>>>> Mark B
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Aju Mathai wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am using the following POI libraries to read in the
files  
>>>>>>>>> with
>>> .xls
>>>>>>>>> and .xlsx files. I found a strange problem ,the
>>> celliterator.hasNext()
>>>>>>>>> sometimes gives true even if the next column in my xls
file  
>>>>>>>>> seems to
>>>>>>>>> be blank, i have no additional columns. How do i take
care  
>>>>>>>>> of such
>>>>>>>>> blank cells ?
>>>>>>>>>
>>>>>>>>> Here is my code snippet
>>>>>>>>>
>>>>>>>>>                     for(Iterator<Cell> ri =  
>>>>>>>>> row.cellIterator();
>>>>>>>>> ri.hasNext();)  {
>>>>>>>>>                         Cell cell = ri.next();
>>>>>>>>>
>>>>>>>>>                               switch(cell.getCellType()){
>>>>>>>>>                               case Cell.CELL_TYPE_STRING:
>>>>>>>>>
>>>>>>>>> text.append(cell.getRichStringCellValue().getString().trim());
>>>>>>>>>                                       break;
>>>>>>>>>                               case Cell.CELL_TYPE_NUMERIC:
>>>>>>>>>
>>>>>>>>> text.append(cell.getNumericCellValue());
>>>>>>>>>                                       break;
>>>>>>>>>                               case Cell.CELL_TYPE_BOOLEAN:
>>>>>>>>>
>>>>>>>>> text.append(cell.getBooleanCellValue());
>>>>>>>>>                                       break;
>>>>>>>>>                               case Cell.CELL_TYPE_FORMULA:
>>>>>>>>>
>>>>>>>>> text.append(cell.getCellFormula());
>>>>>>>>>                                       break;
>>>>>>>>>                               case Cell.CELL_TYPE_BLANK:
>>>>>>>>>                                       break;
>>>>>>>>>                               default:
>>>>>>>>>                                       text.append("");
>>>>>>>>>                               }
>>>>>>>>>                               // Column Delimiter
>>>>>>>>>                               if(ri.hasNext() &&
>>>  !(cell.getCellType()
>>>>>>>>> == Cell.CELL_TYPE_BLANK))
>>>>>>>>>                                        
>>>>>>>>> text.append(COLUMN_DELIMITER);
>>>>>>>>>                       }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> mathaj
>>>>>>>>>
>>>>>>>>>
>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>>
>>> http://old.nabble.com/Problem-reading-XLS-files-tp27254834p27269628.html
>>>>>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>>
>>> http://old.nabble.com/Problem-reading-XLS-files-tp27254834p27307989.html
>>>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>> http://old.nabble.com/Problem-reading-XLS-files-tp27254834p27322196.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>>
>>>
>>
>>
>> --
>> “The reasonable man adapts himself to the world; the unreasonable one
>> persists in trying to adapt the world to himself. Therefore all  
>> progress
>> depends on the unreasonable man.”
>> - George Bernard Shaw (1856 - 1950)
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Mime
View raw message