lucene-java-user mailing list archives

From Donna L Gresh <gr...@us.ibm.com>
Subject Re: Indexing MSword Documents
Date Fri, 08 Jun 2007 17:52:41 GMT
I do exactly this. Here "text" (the second argument to the Field constructor)
is the plain text that I've already extracted from the Word document:

textField = new org.apache.lucene.document.Field(
    textFieldName, text,
    org.apache.lucene.document.Field.Store.NO,          // do not keep the raw text in the index
    org.apache.lucene.document.Field.Index.TOKENIZED);  // analyze the text so it is searchable
doc.add(textField);
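
To illustrate what that combination of flags means, here is a toy sketch in plain Java. It is not Lucene code and not how Lucene is implemented; the class ToyIndex and its methods are hypothetical names made up for this example. It only models the idea: the text is split into tokens that become searchable, while the original string is thrown away (the Store.NO part), so a hit can tell you which document matched but cannot give the text back.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical class, NOT part of Lucene).
class ToyIndex {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Lowercase the text, split it on anything that is not a letter or digit,
    // and record each token. The text itself is not kept anywhere.
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of documents containing the term (case-insensitive).
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Integer>emptySet() : hits;
    }
}
```

In practice this is why a field indexed this way is usually paired with a second, stored field (a file path or document id, for example) so that a search hit can be traced back to the original Word file.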

Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com
