poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 60567] XSSFReader caused OutOfMemoryError when reading a large excel file in HDFS as inputStream
Date Thu, 19 Jan 2017 08:30:43 GMT
https://bz.apache.org/bugzilla/show_bug.cgi?id=60567

Javen O'Neal <onealj@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|XSSFReader caused           |XSSFReader caused
                   |OutOfMemoryError when       |OutOfMemoryError when
                   |reading a lerge excel file  |reading a large excel file
                   |in HDFS as inputStream      |in HDFS as inputStream
           Severity|blocker                     |normal

--- Comment #1 from Javen O'Neal <onealj@apache.org> ---
1,000,000 rows is massive. That's nearly the maximum number of rows per sheet
(1,048,576) allowed by the file format specification.

A 140 MB file is also massive. Keep in mind that an .xlsx file is a zip archive
of XML files, and I would expect 90-95% compression for these files. Unzip it on
your hard drive to see how much disk space it consumes when expanded. It should
be in the neighborhood of 1-3 GB.
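
As a rough way to check this without unzipping by hand, the sketch below sums
the uncompressed entry sizes recorded in the zip and also does the
back-of-envelope estimate from a compression percentage. It uses only
java.util.zip; the class and method names (ExpandedSize, expandedBytes,
estimateExpanded) are made up for illustration and are not part of POI.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper, not part of POI: how large is an .xlsx once expanded? */
final class ExpandedSize {

    /** Sums the uncompressed sizes that the zip records for its entries. */
    static long expandedBytes(Path zipPath) throws IOException {
        long total = 0;
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                long size = entries.nextElement().getSize(); // -1 if not recorded
                if (size > 0) {
                    total += size;
                }
            }
        }
        return total;
    }

    /**
     * Estimates the expanded size from the compressed size and the fraction
     * saved by compression (0.90 for "90% compression"). The result is in the
     * same unit as the input.
     */
    static long estimateExpanded(long compressed, double fractionSaved) {
        return Math.round(compressed / (1.0 - fractionSaved));
    }
}
```

With the numbers above, estimateExpanded(140, 0.90) gives 1400 and
estimateExpanded(140, 0.95) gives 2800 (in MB), which is where the 1-3 GB
range comes from.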

You're also opening the file via an input stream, which has some memory
overhead.
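
One possible way around the stream overhead, sketched here as an assumption
rather than a tested fix for this bug, is to spool the stream (e.g. the one
read from HDFS) to a local temporary file first and then open that file
instead. StreamSpooler and spoolToTempFile are hypothetical names; only the
JDK is used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of POI: copies a stream to a local temp file. */
final class StreamSpooler {

    /** Copies the stream to a new temp file; the caller deletes it when done. */
    static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("workbook-", ".xlsx");
        try {
            // Files.copy streams through a small buffer, so the whole file
            // is never held in memory at once.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a partial file behind
            throw e;
        }
        return tmp;
    }
}
```

The resulting path can then be handed to POI as a file, for example via
OPCPackage.open(tmp.toFile()), instead of passing the original stream. How
much memory this saves for a file of this size is not something I have
measured.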

Therefore, 3.25 GB of memory consumption is reasonable in this case,
considering input stream overhead, memory alignment, garbage collection,
temporary files for unzipping, maintaining references to files in the unzipped
directory structure, and creating XML trees for the minimum set of files that
XSSFReader needs.

If you have any suggestions and could contribute a patch towards lowering
XSSFReader's memory footprint, we'd greatly appreciate the help.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

