Date: Thu, 14 Mar 2013 20:54:12 +0100
Subject: [Bug 52949] New: How to extract VBA Macros code from Excel file by using POI?
From: Barry Lagerweij
To: dev@poi.apache.org

Hi,

I've been looking for a way to extract the source code of VBA modules and macros using POI. Since POI does not provide access to this, I've written a class which allows you to extract the source code as text.

The two attached classes can be used together with POI (I've tested with 3.8 and 3.9) to process vbaProject.bin (for OOXML) and XLS files and retrieve the sources.

The RLEDecompressingInputStream is an InputStream which can be used to decompress the chunks as described in the MS-OVBA specification. It wraps a compressed input stream (usually a DocumentInputStream from POIFS) and decompresses on the fly to conserve memory.
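
To make the chunk format concrete, here is a minimal sketch of the decompression step as I understand it from MS-OVBA. It is not the attached RLEDecompressingInputStream: the class name OvbaDecompressSketch, the SAMPLE bytes and the array-based decompress method are all made up for illustration, and it works on a whole byte array rather than streaming. Despite the "RLE" in the class name, the format uses literal bytes plus copy tokens that refer back into the current chunk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical helper, not the attached RLEDecompressingInputStream. */
final class OvbaDecompressSketch {

    private static final int CHUNK_SIZE = 4096; // a chunk never inflates to more than this

    /** Hand-built sample container: signature byte, one chunk header, 48 bytes of tokens. */
    static final byte[] SAMPLE = {
        0x01, 0x2F, (byte) 0xB0,
        0x00, 0x23, 0x61, 0x61, 0x61, 0x62, 0x63, 0x64, 0x65,
        (byte) 0x82, 0x66, 0x00, 0x70, 0x61, 0x67, 0x68, 0x69, 0x6A, 0x01, 0x38,
        0x08, 0x61, 0x6B, 0x6C, 0x00, 0x30, 0x6D, 0x6E, 0x6F, 0x70,
        0x06, 0x71, 0x02, 0x70, 0x04, 0x10, 0x72, 0x73, 0x74, 0x75, 0x76,
        0x10, 0x77, 0x78, 0x79, 0x7A, 0x00, 0x3C
    };

    /** Decompresses a complete compressed container held in memory. */
    static byte[] decompress(byte[] in) {
        if (in.length == 0 || in[0] != 0x01) {
            throw new IllegalArgumentException("missing container signature byte 0x01");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[CHUNK_SIZE];
        int p = 1;
        while (p + 2 <= in.length) {
            // Little-endian header: bits 0-11 = chunk size - 3, bit 15 = compressed flag.
            int header = (in[p] & 0xFF) | ((in[p + 1] & 0xFF) << 8);
            p += 2;
            int end = p + (header & 0x0FFF) + 1; // data bytes that follow the header
            if (end > in.length) {
                throw new IllegalArgumentException("truncated chunk");
            }
            if ((header & 0x8000) == 0) {
                out.write(in, p, end - p);       // raw chunk, stored as-is
            } else {
                out.write(chunk, 0, inflate(in, p, end, chunk));
            }
            p = end;
        }
        return out.toByteArray();
    }

    /** Expands one compressed chunk into the given buffer and returns the byte count. */
    private static int inflate(byte[] in, int p, int end, byte[] chunk) {
        int pos = 0;
        while (p < end) {
            int flags = in[p++] & 0xFF;          // one flag bit per following token, LSB first
            for (int bit = 0; bit < 8 && p < end; bit++) {
                if ((flags & (1 << bit)) == 0) { // literal token: copy one byte
                    check(pos < CHUNK_SIZE);
                    chunk[pos++] = in[p++];
                } else {                         // copy token: 16 bits, offset in the high bits
                    check(p + 2 <= end && pos > 0);
                    int token = (in[p] & 0xFF) | ((in[p + 1] & 0xFF) << 8);
                    p += 2;
                    // Offset width grows with the output position: ceil(log2(pos)), at least 4.
                    int bitCount = Math.max(4, 32 - Integer.numberOfLeadingZeros(pos - 1));
                    int length = (token & (0xFFFF >> bitCount)) + 3;
                    int offset = (token >> (16 - bitCount)) + 1;
                    check(offset <= pos && pos + length <= CHUNK_SIZE);
                    for (int i = 0; i < length; i++, pos++) {
                        chunk[pos] = chunk[pos - offset]; // byte-wise, so overlapping copies work
                    }
                }
            }
        }
        return pos;
    }

    private static void check(boolean ok) {
        if (!ok) {
            throw new IllegalArgumentException("corrupt compressed chunk");
        }
    }

    public static void main(String[] args) {
        System.out.println(new String(decompress(SAMPLE), StandardCharsets.US_ASCII));
    }
}
```

A streaming version would keep only the current 4096-byte chunk buffer and refill it whenever read() runs dry, which is what keeps memory use low.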

The VBAMacroExtractor processes the OLE binary stream records, records the CodePage (needed to convert byte arrays to Strings) and stores the ModuleOffset. This offset specifies the location in the MemoryStream where the source code starts. The VBAMacroExtractor automatically detects XLSM or XLS, and uses POIFSReader to process the file only once and conserve memory.
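
The record handling described above could look roughly like the sketch below. It is not the attached VBAMacroExtractor: DirStreamSketch, sample() and charsetFor() are hypothetical names, and the input is assumed to be the already decompressed "dir" stream. The record IDs (0x0003 for the code page, 0x0031 for the module offset) and the oversized PROJECTVERSION record are taken from my reading of MS-OVBA, so please check them against the specification before relying on them.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a hypothetical helper, not the attached VBAMacroExtractor. */
final class DirStreamSketch {

    // Record IDs as I read them in MS-OVBA (assumption, verify against the spec).
    static final int PROJECTCODEPAGE = 0x0003;
    static final int PROJECTVERSION = 0x0009;
    static final int MODULEOFFSET = 0x0031;

    int codePage = 1252;                                   // assumed fallback if no record is seen
    final List<Integer> moduleOffsets = new ArrayList<>(); // one entry per module, in stream order

    /** Walks the id/size/data records of a decompressed dir stream. */
    static DirStreamSketch parse(byte[] dir) {
        DirStreamSketch result = new DirStreamSketch();
        ByteBuffer buf = ByteBuffer.wrap(dir).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 6) {
            int id = buf.getShort() & 0xFFFF;
            int size = buf.getInt();
            if (id == PROJECTVERSION) {
                size += 2; // assumption: size field says 4, but 4 + 2 version bytes follow
            }
            if (size < 0 || size > buf.remaining()) {
                throw new IllegalArgumentException(
                        "bad size for record 0x" + Integer.toHexString(id));
            }
            int start = buf.position();
            if (id == PROJECTCODEPAGE && size >= 2) {
                result.codePage = buf.getShort(start) & 0xFFFF;
            } else if (id == MODULEOFFSET && size >= 4) {
                result.moduleOffsets.add(buf.getInt(start));
            }
            buf.position(start + size);                    // skip the data of every record
        }
        return result;
    }

    /** Maps a Windows code page number to a Java charset; only covers the windows-125x family. */
    static Charset charsetFor(int codePage) {
        return Charset.forName("windows-" + codePage);
    }

    /** Builds a tiny synthetic dir stream: code page 1252, a version record, one module offset. */
    static byte[] sample() {
        ByteBuffer b = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);
        b.putShort((short) PROJECTCODEPAGE).putInt(2).putShort((short) 1252);
        b.putShort((short) PROJECTVERSION).putInt(4).putInt(1).putShort((short) 0);
        b.putShort((short) MODULEOFFSET).putInt(4).putInt(0x0123);
        return Arrays.copyOf(b.array(), b.position());
    }

    public static void main(String[] args) {
        DirStreamSketch dir = parse(sample());
        System.out.println(charsetFor(dir.codePage) + " " + dir.moduleOffsets);
    }
}
```

Once the code page and offset are known, the source bytes of a module can be turned into text with new String(bytes, charsetFor(codePage)). Code pages outside the windows-125x family would need an explicit lookup table instead of the string concatenation used here.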

It might be worthwhile to enhance the POI workbook with classes that provide access to the VBA modules; see Andrey Yesyev's contributions to the Nabble mailing list.

I hope it's useful; feel free to use the sources under the Apache 2 license.

With kind regards,

Barry