From: Chris Bamford
To: POI Users List
Subject: Re: Extracting embedded files from HWPF docs
Date: Mon, 10 Jun 2013 08:32:24 +0000

Hi Nick,

I created a .doc file with an embedded MP3 (that is, I dragged an MP3 file from Finder and dropped it into the document whereupon Word displayed a small image of a loudspeaker - I took this as a positive sign!).
I then added some text for good measure and saved it, taking care to save it as "Word 97 - 2004".
Then I ran POIFSLister -sizes on it and got:

Root Entry -
  SummaryInformation <(0x05)SummaryInformation> [4096 / 0x1000]
  DocumentSummaryInformation <(0x05)DocumentSummaryInformation> [4096 / 0x1000]
  WordDocument [9152 / 0x23c0]
  1Table [7280 / 0x1c70]
  CompObj <(0x01)CompObj> [96 / 0x60]

Looking closer in the debugger, I discovered that none of the entries shown are of type DirectoryNode, so I cannot even start the process of finding / extracting the MP3.
Any ideas what I might be doing wrong?

Thanks,

- Chris

On 7 Jun 2013, at 14:26, Chris Bamford wrote:

Thanks Nick, must have missed that. Will check it out.

Chris

On 7 Jun 2013, at 14:12, Nick Burch wrote:

> On Fri, 7 Jun 2013, Chris Bamford wrote:
>> Is there a way to extract files embedded into Word docs (.doc, not .docx), using the HWPF package?
>
> Does the information on http://poi.apache.org/poifs/embeded.html not cover what you need?
>
> Nick