poi-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: xlsx somewhat recently switched to Scientific notation for long sequences of digits?
Date Wed, 29 Jun 2016 18:09:28 GMT
Got it. I realize there's a double under the hood for all numbers in POI and Excel, and I agree
that you can't have a 16 digit numeric in Excel...that would have to be stored as a string/text
cell in Excel.
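To illustrate the point about the underlying double, here is a minimal sketch in plain Java (no POI involved). The class and method names are hypothetical. It checks whether a long survives a round trip through a double: the 15-digit value from this thread does, while a 16-digit value just above 2^53 does not.

```java
public class DoubleRoundTrip {
    // Returns true if the long value is exactly representable as a double.
    static boolean survivesDouble(long value) {
        double asDouble = (double) value;
        return (long) asDouble == value;
    }

    public static void main(String[] args) {
        // 15-digit value from this thread: below 2^53, so it is stored exactly.
        System.out.println(survivesDouble(340229177292566L));
        // 2^53 + 1, a 16-digit value: it is rounded when stored as a double.
        System.out.println(survivesDouble(9007199254740993L));
    }
}
```

Not every 16-digit number is lost this way, but there is no guarantee for any given one, which is why such values are safer as text cells.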

The question has more to do with a change in formatting for numerics with fewer than 16 digits.

With poi-3.13, FormatTrackingHSSFListener's formatNumberDateCell(number) for these numbers
yielded "340229177292566".  

With poi-3.15-beta1, we're getting "3.40229E+14".  Again, to be fair, "3.40229E+14" is exactly
what Excel displays if the columns are of a certain width, so in some ways this is progress.

The question: is there an easy way for us to get the old behavior?
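For reference, one possible workaround outside of POI's formatter, assuming the caller can get at the raw double value of the cell, is to render it with java.math.BigDecimal rather than a general number format. This is only a sketch using the JDK; the class and method names are hypothetical, and it is not a POI API or a statement of how poi-3.13 produced its output.

```java
import java.math.BigDecimal;

public class PlainDigits {
    // Renders a double without an exponent, e.g. for long digit sequences.
    // BigDecimal.valueOf goes through Double.toString, so it keeps the
    // shortest decimal form that identifies the double.
    static String toPlainDigits(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        double cellValue = 340229177292566d;
        // Double.toString on its own uses an exponent for values this large.
        System.out.println(Double.toString(cellValue));
        // The plain rendering yields the full sequence of digits.
        System.out.println(toPlainDigits(cellValue));
    }
}
```

This ignores the cell's format string entirely, so it would only be appropriate for cells that use the General format.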

-----Original Message-----
From: Javen O'Neal [mailto:javenoneal@gmail.com] 
Sent: Wednesday, June 29, 2016 11:13 AM
To: POI Users List <user@poi.apache.org>
Subject: Re: xlsx somewhat recently switched to Scientific notation for long sequences of digits?

Excel and POI don't make a distinction between double/decimal and int. Does Excel make any
guarantees that doubles won't have precision issues?

16-digit credit cards are not storable as 32-bit ints, but require 64-bit longs.

On Jun 29, 2016 5:53 AM, "Allison, Timothy B." <tallison@mitre.org> wrote:

  On https://issues.apache.org/jira/browse/TIKA-2025, a Tika user noted that, at least for
xlsx, what used to be rendered as a long sequence of digits (e.g. 340229177292566) is now
being extracted as scientific notation (3.40229E+14).  This new behavior mimics Excel more
closely, but is there an easy/obvious way for us at the Tika level to revert back to extracting
the full sequence of digits or do I have to look into this at the POI level?
  Thank you.

