poi-user mailing list archives

From <vivian...@emc.com>
Subject RE: detect format using POI
Date Thu, 26 Aug 2010 17:13:40 GMT
Thank you very much, Nick!

-----Original Message-----
From: Nick Burch [mailto:nick.burch@alfresco.com] 
Sent: Thursday, August 26, 2010 9:55 AM
To: POI Users List
Subject: Re: detect format using POI

On Thu, 26 Aug 2010, vivian.li@emc.com wrote:
> Are there utilities in POI to detect a file's format (suppose the file 
> has no dos extension), at least for office files? If so can somebody 
> please point me to the spot?

If you're really not sure at all of the format, use Apache Tika:
 	http://tika.apache.org/0.7/detection.html

POI also has org.apache.poi.extractor.ExtractorFactory, which picks the 
correct kind of text extractor for any supported file format. That may 
get you close: from the POIOLE2TextExtractor or POIXMLTextExtractor it 
gives you, you can get at the underlying open document.
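
As a rough illustration of the first step of this kind of detection, here is a minimal sketch that looks only at a file's leading bytes to tell an OLE2 container (legacy .doc/.xls/.ppt) from a ZIP package (the container used by .docx/.xlsx/.pptx). It uses only the JDK and is not POI's or Tika's API; the FormatSniffer class and its sniff methods are hypothetical names made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not part of POI): guesses the container format from magic bytes. */
final class FormatSniffer {
    // OLE2 compound document signature, used by legacy binary Office files.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature ("PK\3\4"); OOXML files are ZIP packages.
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a header (the first bytes of a file) as "OLE2", "ZIP" or "UNKNOWN". */
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2)) return "OLE2";
        if (startsWith(header, ZIP)) return "ZIP";
        return "UNKNOWN";
    }

    /** Reads up to 8 bytes from the file and classifies them; the extension is ignored. */
    static String sniff(Path file) throws IOException {
        byte[] header = new byte[8];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) break; // file is shorter than 8 bytes
                total += n;
            }
        }
        return sniff(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Calling FormatSniffer.sniff(Paths.get("somefile")) on a file with no extension would return one of the three labels. This only identifies the container: an "OLE2" result does not say whether the file is Word, Excel or PowerPoint, and a "ZIP" result may be any ZIP archive, not necessarily an Office document. Telling those apart means looking inside the container, which is the part Tika or ExtractorFactory handles.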

Nick


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


