poi-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Law" <david....@apconsult.de>
Subject Re: HSSFCell short/int
Date Thu, 03 Dec 2009 13:48:27 GMT
Hi Josh,

thanks for that thorough explanation.
My formulation was rather ambiguous.

The need for migrating to int is clear.

My issue is that for Rows it makes sense in the HSSF world to
convert to "unsigned short" by AND'ing with 65535, as that is
the maximum no. of rows, but for columns the max is 256, so it
is really quite misleading, because the signed range of the
short is sufficient.

The HSSFCell.getCell(short) code...
int ushortCellNum = cellnum & 0x0000FFFF; // avoid sign extension
return getCell(ushortCellNum);
...raises an issue with the beholder that is not relevant.

Maybe I can illustrate that with an example:
I actually noticed this while migrating code from the deprecated
method to the int method. My first shot was to AND my short with
65535 before calling the int method, so you see, I was forced to
deal with an issue that was irrelevant. My code would of course
have worked with the AND, but its not necessary.

I would replace the 2 lines with...
return getCell((int)cellnum);

In any case, I would think its better to give an out-of-range error
that mentions the value supplied & not its unsigned equivalent.

Regards,
DaveLaw

----- original message --------

Subject: Re: HSSFCell short/int
Sent: Wed, 02 Dec 2009
From: Josh Micich<josh.micich@gmail.com>

> Hello Dave,
> 
> It's not clear if you are questioning the merit of creating the newer
> method "getCell(int)", or whether the old method "getCell(short)"
> should have been deprecated at all, so I'll attempt to answer both
> concerns.
> 
> 
> The most visible reason why POI is moving away from bytes and shorts
> is the need for typecasts when using literal integer constants, even
> when the value is quite obviously within range.  For example without
> the new method "row.getCell(5)" would not compile.  We would have to
> write "row.getCell((short)5)".
> 
> There have been several POI bugs involving representation of unsigned
> (ushort, ubyte) quantities with java datatypes (signed) of the same
> width (short, byte).  When shorts and bytes are used, the corrected
> code generally needs more typecasts and bit mask expressions.   If
> those datatypes are changed to int, the code is usually simpler and
> easier to read.
> 
> The new method "getCell(int) method may in the first place have been
> introduced as a result of the XSSF common interface work (a separate
> reason, as suggested by Mark B).
> 
> 
> That sort-of explains the rationale for creating the new method
> "getCell(int)".   The reason for deprecating the old is to eliminate
> overloading (of "getCell").  Method overloading is sometimes very
> handy but has quite a few pitfalls, and for that reason POI tries to
> use overloading sparingly.
> 
> As far as the exact meaning of the '@deprecated' tag, in POI (unlike
> in the java runtime library) it means "This method will be removed in
> a future POI version".  An alternative is always given.  A deprecation
> date has been provided in most places, to help POI users prioritize
> clean-up of their own code.
> 
> 
> You mentioned the comment from the the first line of the deprecated method:
>         int ushortCellNum = cellnum & 0x0000FFFF; // avoid sign extension
> The bit-mask is correct for 16-bit unsigned conversion.  Masking with
> 0xFF would be wrong because it would silently convert every possible
> input to a valid column index.  For example: is it sensible for
> "getCell((short)300)" to succeed?   Maybe it looks pointless to
> convert values in the range (-32768...-1) to (32768...65535) is
> because the maximum column index is 255.  This conversion to unsigned
> 16-bit has been done because it's more likely that the caller intends
> an unsigned quantity (column indexes are never negative).  So if a
> negative value *does* get into this method, it's probably better to
> format the error message in terms of the unsigned 16 bit quantity.
> Similar results could have been achieved with bounds checking the
> argument, but would involve duplicating all the MissingCellPolicy
> logic too.
> So - that line of code is there for making better error messages, and
> has little to do with the reasons for the  deprecation.
> 
> regards,
> Josh
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 

--- original message end ----


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Mime
View raw message