poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 48425] New: DateUtil.isCellDateFormatted() method is slow
Date Mon, 21 Dec 2009 13:18:32 GMT

           Summary: DateUtil.isCellDateFormatted() method is slow
           Product: POI
           Version: 3.6
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
        AssignedTo: dev@poi.apache.org
        ReportedBy: jan.stette@gmail.com

I have done some performance testing for code reading data from large
spreadsheets using POI.  In this use case, I found that half of the CPU time
was spent in a single method in POI: DateUtil.isCellDateFormatted(cell).  We
call this method every time we extract a value from a cell in order to
correctly create Date objects when cells contain dates.

Most of the time in this method is spent in DateUtil.isADateFormat(), which is
very slow: it performs seven regular expression substitutions on the
formatString parameter, plus one additional regex match.  None of the regexes
are precompiled, so all eight are recompiled on every call to this method.
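
To illustrate why the per-call compilation matters: String.replaceAll(regex,
repl) is documented as equivalent to
Pattern.compile(regex).matcher(str).replaceAll(repl), so each call pays for a
fresh compile.  The sketch below compares that with a precompiled Pattern.  The
class name, the regex, and the sample format string are made up for
illustration and are not taken from the POI source; the timings it prints are
only a rough indication and will vary by JVM.

```java
import java.util.regex.Pattern;

final class RegexCompileCost {
    // Hypothetical regex, for illustration only (not copied from POI).
    private static final String REGEX = "^\\[[a-zA-Z]+\\]";

    // Compiled once, when the class is loaded.
    private static final Pattern PRECOMPILED = Pattern.compile(REGEX);

    // Compiles REGEX on every call, as String.replaceAll() does internally.
    static String perCall(String s) {
        return s.replaceAll(REGEX, "");
    }

    // Reuses the already compiled pattern.
    static String precompiled(String s) {
        return PRECOMPILED.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String sample = "[Red]dd/mm/yyyy"; // made-up format string
        int iterations = 200_000;
        int sink = 0; // keeps the JIT from discarding the work

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += perCall(sample).length();
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += precompiled(sample).length();
        }
        long t2 = System.nanoTime();

        System.out.println("per-call compile: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("precompiled:      " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("(checksum " + sink + ")");
    }
}
```

Both methods return the same result; only the amount of work per call differs.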

I would suggest replacing the first five regexes with calls to a string
substitution method that doesn't require regexes, as they are simple
replacements.  For the remaining three regexes, I would suggest precompiling
them instead of just calling String.replaceAll() and String.matches().
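
A minimal sketch of what I have in mind follows.  It is not a patch against
DateUtil: the class name, method name, literals and patterns are hypothetical
stand-ins, and only two literal replacements and two regexes are shown to keep
it short.  Whether String.replace() really avoids the regex machinery depends
on the JDK version, so that part would be worth measuring.

```java
import java.util.regex.Pattern;

final class DateFormatCheckSketch {
    // Hypothetical patterns, compiled once instead of on every call.
    private static final Pattern LEADING_BRACKET =
            Pattern.compile("^\\[[a-zA-Z]+\\]");
    private static final Pattern DATE_CHARS =
            Pattern.compile("^[yYmMdDhHsS\\-/,. :]+$");

    static boolean looksLikeDateFormat(String formatString) {
        String fs = formatString;

        // Simple literal replacements: no regex syntax needed here.
        fs = fs.replace("\\-", "-"); // backslash-hyphen becomes hyphen
        fs = fs.replace("\\,", ","); // backslash-comma becomes comma

        // Substitution that does need a regex, using the precompiled pattern.
        fs = LEADING_BRACKET.matcher(fs).replaceAll("");

        // Final match, again with a precompiled pattern.
        return DATE_CHARS.matcher(fs).matches();
    }

    public static void main(String[] args) {
        // Made-up format strings for a quick manual check.
        System.out.println(looksLikeDateFormat("dd/mm/yyyy"));
        System.out.println(looksLikeDateFormat("0.00"));
    }
}
```

The static final Pattern fields are safe to share between threads, since
Pattern instances are immutable; only Matcher objects are per-call.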

