poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 55611] New: Performance improvement (~7% up to ~27%) by adding a cache to DateUtil.isADateFormat(int, String)
Date Mon, 30 Sep 2013 09:11:33 GMT

            Bug ID: 55611
           Summary: Performance improvement (~7% up to ~27%) by adding a
                    cache to DateUtil.isADateFormat(int, String)
           Product: POI
           Version: 3.9
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: HSSF
          Assignee: dev@poi.apache.org
          Reporter: luca.dellatoffola@inf.ethz.ch

Created attachment 30894
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30894&action=edit
Patch for poi-3.9

We found an easy way to improve POI's performance. The idea is to avoid
re-checking in DateUtil.isADateFormat(int, String) whether a given format
string represents a date format when the same string is passed several times
in a row.
This can be done safely by adding a single-entry static cache: each call checks
whether the parameters changed since the previous call and, if they did,
invalidates the cache.
Our attached patch first checks whether the format string and format index are
the same as in the previous call. If they are, the stored result is reused;
otherwise the real check is executed and the required data is stored.
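
The caching idea described above can be sketched roughly as follows. This is
only an illustration, not the attached patch: the class name
DateFormatCheckCache, the slowChecks counter, and the simplified slowCheck
stand-in for the real format analysis are all hypothetical, and making the
method synchronized is an assumption about how shared static state might be
protected.

```java
import java.util.Locale;
import java.util.Objects;

/** Hypothetical single-entry cache sketch; not the code from the attached patch. */
final class DateFormatCheckCache {
    // Parameters and result of the most recent call (the single cache entry).
    private static boolean hasCachedEntry = false;
    private static int lastFormatIndex = -1;
    private static String lastFormatString = null;
    private static boolean lastResult = false;

    // Counts how often the slow path runs; present for demonstration only.
    static int slowChecks = 0;

    private DateFormatCheckCache() {}

    static synchronized boolean isADateFormat(int formatIndex, String formatString) {
        // Fast path: same parameters as the previous call, so reuse the stored result.
        if (hasCachedEntry
                && formatIndex == lastFormatIndex
                && Objects.equals(formatString, lastFormatString)) {
            return lastResult;
        }

        // Slow path: parameters changed, so run the check and overwrite the entry.
        boolean result = slowCheck(formatIndex, formatString);
        lastFormatIndex = formatIndex;
        lastFormatString = formatString;
        lastResult = result;
        hasCachedEntry = true;
        return result;
    }

    // Assumption: a deliberately simplistic stand-in for the real date-format analysis.
    private static boolean slowCheck(int formatIndex, String formatString) {
        slowChecks++;
        if (formatString == null) {
            return false;
        }
        String f = formatString.toLowerCase(Locale.ROOT);
        return f.contains("yy") || f.contains("dd") || f.contains("mm");
    }
}
```

With this sketch, repeated calls with the same index and string run the slow
check only once, while alternating between two different formats runs it on
every call, because the cache holds a single entry. That trade-off suits
workloads where many consecutive cells share one format.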
For example, when running POI 3.9 on a small document (~40 KB) and on a larger
document (~13.5 MB), the patch reduces the running time, giving a speedup of
~7% in the first case and ~12% in the second.
Additionally, we ran an experiment using this patch with Tika 1.3: a test on a
set of nine documents (~13.9 MB) showed a 27% speedup.
