poi-dev mailing list archives

From Barry Lagerweij <blagerw...@gmail.com>
Subject [Bug 52949] New: How to extract VBA Macros code from Excel file by using POI?
Date Thu, 14 Mar 2013 19:54:12 GMT

I've been looking for a way to extract the source code of VBA modules and
macros using POI. Since POI does not provide access to this, I've written a
class which allows you to extract the source code as text.

The two attached classes can be used together with POI (I've tested with
3.8 and 3.9) to process either an XLS file or the vbaProject.bin part of an
OOXML file and retrieve the sources.

The RLEDecompressingInputStream is an InputStream which can be used to
decompress the chunks as described in the MS-OVBA specification. It wraps
around a compressed InputStream (usually a DocumentInputStream from the
POIFS) and decompresses on the fly to conserve memory.
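
As a rough illustration of the chunk decompression mentioned above, here is
a minimal sketch of the MS-OVBA algorithm in plain Java. It is not the
attached RLEDecompressingInputStream: the class name OvbaDecompressSketch
and its decompress method are hypothetical, and it decodes a whole byte
array in memory instead of streaming.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of POI or of the attached classes.
final class OvbaDecompressSketch {

    /** Decompresses an MS-OVBA "compressed container" held fully in memory. */
    static byte[] decompress(byte[] in) {
        if (in.length == 0 || in[0] != 0x01) {
            throw new IllegalArgumentException("missing 0x01 container signature");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 1;
        while (pos + 2 <= in.length) {
            // Chunk header: 16 bits, little-endian.
            int header = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
            pos += 2;
            // Low 12 bits hold (total chunk size - 3); the total includes
            // the 2 header bytes, so the data length is field + 1.
            int dataLen = (header & 0x0FFF) + 1;
            boolean compressed = (header & 0x8000) != 0;
            int end = Math.min(pos + dataLen, in.length);

            if (!compressed) {
                // Raw chunk: copy the bytes through unchanged.
                out.write(in, pos, end - pos);
                pos = end;
                continue;
            }

            byte[] buf = new byte[4096]; // a chunk never expands past 4096 bytes
            int n = 0;                   // bytes decompressed in this chunk so far
            while (pos < end) {
                int flags = in[pos++] & 0xFF;
                // Each flag bit (LSB first) describes one token.
                for (int bit = 0; bit < 8 && pos < end; bit++) {
                    if ((flags & (1 << bit)) == 0) {
                        // Literal token: one byte copied as-is.
                        if (n >= buf.length) throw new IllegalArgumentException("chunk overflow");
                        buf[n++] = in[pos++];
                    } else {
                        // Copy token: 16 bits split into offset and length.
                        if (pos + 1 >= end) throw new IllegalArgumentException("truncated copy token");
                        int token = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
                        pos += 2;
                        // The split depends on how much of the chunk is decoded:
                        // offset bits = max(4, ceil(log2(n))).
                        int bitCount = 4;
                        while ((1 << bitCount) < n) bitCount++;
                        int lengthMask = 0xFFFF >> bitCount;
                        int length = (token & lengthMask) + 3;
                        int offset = (token >> (16 - bitCount)) + 1;
                        if (offset > n || n + length > buf.length) {
                            throw new IllegalArgumentException("bad copy token");
                        }
                        // Byte-by-byte so overlapping copies repeat earlier output.
                        for (int i = 0; i < length; i++) {
                            buf[n] = buf[n - offset];
                            n++;
                        }
                    }
                }
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

For example, the seven bytes 01 03 B0 02 61 45 00 hold one literal 'a'
followed by a single copy token (offset 1, length 72), so
OvbaDecompressSketch.decompress returns 73 copies of 'a'.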

The VBAMacroExtractor processes the OLE binary stream records, records the
CodePage (in order to convert byte arrays to Strings) and stores the
ModuleOffset. This offset specifies the location in the module stream where
the source code starts. The VBAMacroExtractor has been written to
automatically detect XLSM or XLS, and uses POIFSReader to process the file
only once and conserve memory.
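
To make the format detection and the CodePage conversion a bit more
concrete, here is a small sketch in plain Java. It is not how the attached
VBAMacroExtractor works internally. The class VbaContainerSketch and its
methods are hypothetical. The sketch assumes that an XLS file starts with
the OLE2 signature and an XLSM file with the ZIP local-file signature, and
that the CodePage is one of the Windows 125x pages that Java exposes as
"windows-<number>".

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of POI or of the attached classes.
final class VbaContainerSketch {

    enum Container { OLE2, ZIP, UNKNOWN }

    // First 8 bytes of every OLE2 compound file (XLS, vbaProject.bin).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // "PK\3\4": first 4 bytes of a ZIP archive such as an XLSM file.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Guesses the container type from the first bytes of the file. */
    static Container detect(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    /**
     * Converts module bytes to a String using the recorded CodePage.
     * Assumption: the code page maps to a "windows-NNNN" charset name,
     * which holds for 1252 but not for every code page value.
     */
    static String decode(byte[] raw, int codePage) {
        return new String(raw, Charset.forName("windows-" + codePage));
    }
}
```

A caller would read the first eight bytes of the file, pass them to
detect, and for ZIP open the archive and look for the vbaProject.bin entry
before handing that stream to POIFS.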

It might be worthwhile to enhance the POI workbook with classes which
provide access to the VBA modules; see Andrey Yesyev's contributions to the
Nabble mailing list.

I hope it's useful. Feel free to use the sources under the Apache 2 license.

With kind regards,

