| Message view | « Date » · « Thread » |
|---|---|
| From | Michael Prichard <michael_prich...@mac.com> |
| Subject | extracting non-english text from word, pdf, etc....?? |
| Date | Wed, 01 Aug 2007 05:44:50 GMT |
I know how to extract English text with POI, PDFBox, and so on. Now I want to start indexing non-English text, such as French and Spanish. Which extraction libraries are available for that? I need to handle these formats: Excel, Word, PowerPoint, PDF, HTML, and RTF. Thanks! Michael | |