Message-ID: <49616D3A.1010605@teamware.com>
Date: Mon, 05 Jan 2009 13:15:22 +1100
From: Antony Bowesman
To: POI Developers List
Subject: Text extractor meta data

I'm using POI 3.5b4 and ExtractorFactory to get an extractor for various types of MS document. I see that OOXML does not yet support metadata, but for the OLE variants I'm having trouble getting the metadata in a simple way.

The only method in the returned POITextExtractor is getText(), which gives a line-delimited String of PID_XXX = value entries, so I have to parse the strings out and match them against the PropertyIDMap names.

Alternatively, I can cast the returned extractor to POIOLE2TextExtractor and get the SI and DSI (SummaryInformation and DocumentSummaryInformation) from there, but then I simply want to get certain properties from them. I don't want to have to write code that calls things like getAuthor(), as the required properties are driven from external config. The getProperty() method is protected for some reason, but getProperties() is not.

What's the recommended way to get the properties I want?

Cheers
Antony
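
For illustration, the string-parsing workaround described above could be sketched roughly as follows. This is only a sketch: it assumes the getText() output is exactly one "NAME = value" entry per line, and the class name PropertyTextParser, the sample text, and the list of wanted names are hypothetical, not part of POI.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of POI: turns line-delimited
// "NAME = value" text into a map keyed by property name.
public class PropertyTextParser {

    public static Map<String, String> parse(String text) {
        Map<String, String> props = new LinkedHashMap<>();
        if (text == null) {
            return props;
        }
        for (String line : text.split("\\r?\\n")) {
            // Split on the first '=' only, so values may contain '='.
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // blank line, or no property name before '='
            }
            String name = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            if (!name.isEmpty()) {
                props.put(name, value);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the extractor's getText() result.
        String text = "PID_AUTHOR = Jane Doe\nPID_TITLE = Quarterly report\n";

        // Names that would come from external config in practice.
        List<String> wanted = Arrays.asList("PID_AUTHOR", "PID_TITLE");

        Map<String, String> props = parse(text);
        for (String name : wanted) {
            System.out.println(name + " -> " + props.get(name));
        }
    }
}
```

The map lookup keeps the set of wanted properties configurable, but it still depends on the text format staying stable, which is why a direct property accessor would be preferable.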