poi-user mailing list archives

From MSB <markbrd...@tiscali.co.uk>
Subject Re: Q: How to check if a Word .doc file is a mail merge master file?
Date Thu, 26 Feb 2009 16:59:41 GMT

Hello Christian,

I would guess that the answer to your second question is yes. It is possible
to use HWPF to extract the text from a Word document; in fact, Nick has
built a class that does just this, called WordExtractor I think. If I
remember correctly, it returns an array of Strings, so it is not difficult
to imagine checking the complete set of values returned: if, and only if,
that complete set is limited to your 'table structure' (if I understand
that correctly), then the document would pass your validation test.
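
To make that check concrete, here is a rough sketch of the validation step only. It uses plain Java and no POI calls, so how you obtain the paragraph text from HWPF is left open. It assumes (and this is an assumption on my part, not something I have verified against your files) that the raw text still contains Word's field codes, each starting with the field-begin mark 0x13 followed by "MERGEFIELD" and the field name. The class and method names are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MergeFieldCheck {

    // Assumption: a merge field appears in the raw text as
    //   <0x13> MERGEFIELD name <0x14> displayed result <0x15>
    // where the name may be wrapped in double quotes.
    private static final Pattern MERGE_FIELD = Pattern.compile(
            "\\x13\\s*MERGEFIELD\\s+(?:\"([^\"]+)\"|([^\\s\\x14\\x15]+))");

    // Collects the distinct merge field names found in the given paragraphs.
    public static Set<String> extractFieldNames(String[] paragraphs) {
        Set<String> names = new LinkedHashSet<>();
        for (String paragraph : paragraphs) {
            Matcher m = MERGE_FIELD.matcher(paragraph);
            while (m.find()) {
                // Group 1 is a quoted name, group 2 an unquoted one.
                names.add(m.group(1) != null ? m.group(1) : m.group(2));
            }
        }
        return names;
    }

    // Passes only if at least one merge field is present and every field
    // found belongs to the known "table structure".
    public static boolean isValidMaster(String[] paragraphs, Set<String> allowed) {
        Set<String> used = extractFieldNames(paragraphs);
        return !used.isEmpty() && allowed.containsAll(used);
    }
}
```

Treating a document with no merge fields at all as invalid is my own choice here; whether that is the right rule depends on how you define a mail merge master.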

To answer your first question, I need to ask another one: what set of
criteria distinguishes a mail merge master file from any other document or
document template that could be created using Word? If you are able to
formulate such a list, then it should be possible to determine whether HWPF
can be used to parse the Word file and determine its status.

Christian Gosch-2 wrote:
> Is it possible using POI to check if a given Word *.doc file 
> (Word2K/2003) is a Mail Merge master file?
> Is it then possible to retrieve or find by inspection the mail merge 
> data field references used in the mail merge master file?
> We do not need to change anything, we just want to check if a given file 
> is a valid mail merge master and matches a given and known "table 
> structure", i. e. uses only a given set of mail merge data field 
> references. (validation)
> Up to now, our validation just checks the file extension and does not 
> execute any introspection.
> Thanks for answers,
> -- 
> Dipl.-Inform. Christian Gosch, PMI PMP
> Systems Architecture, Project Management
> inovex GmbH
> Büro Pforzheim
> Karlsruher Strasse 71
> D-75179 Pforzheim
> Tel: +49 (0)7231 3191-85
> Fax: +49 (0)7231 3191-91
> c.gosch@inovex.de
> www.inovex.de
> Sitz der Gesellschaft: Pforzheim
> AG Mannheim, HRB 502126
> Geschäftsführer: Stephan Müller 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
