lucene-java-user mailing list archives

From "Kumar Gaurav" <>
Subject Does Lucene Java 2.3.2 support parsing of Microsoft Office 2007 documents?
Date Fri, 27 Jun 2008 11:08:21 GMT
Dear all,


I am currently using the Lucene Java 2.3.2 demo to parse Microsoft Office 2003
and 2007 documents and PDF files.

It can parse files with extensions such as *.pdf, *.doc and *.xls.

However, searches do not find anything in Microsoft Office 2007 files.

The demo does report that it is indexing *.docx and other Office 2007 files.


Does Lucene Java support parsing files with the extensions *.docx, *.pptx and
*.mpp, i.e. Microsoft Office 2007 documents?

If it does, what needs to be changed in the Lucene 2.3.2 demo so that queries
search files with the extensions mentioned above?



