lucene-java-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: How to modify a document Field before the document is indexed?
Date Mon, 19 Jul 2010 23:12:57 GMT
(10/07/20 7:31), Joe Hansen wrote:
> Hey All,
>
> I am using Apache Lucene (2.9.1) and it's fast and it works great! I
> have a question in connection with Apache PDFBox.
>
> The following command creates a Lucene Document from a PDF file:
> Document document =
> org.apache.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(docFile);
>
> The Lucene Document, document, has a bunch of fields. Among those
> fields is a field named "content". I need to add some more data to
> that field. For example, I would like to add some description and
> keywords. How do I go about doing that? Any pointers would be greatly
> welcome! :)
>
> Thanks for your time!
>
> Regards,
> Joe
>
>    
Joe,

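You can add your data to the document object as another field named
"content". Lucene allows several fields with the same name in one
document, and a search on "content" will match text from any of them: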

Document document = org.apache.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(docFile);
document.add(new Field("content", "your data", Field.Store.YES, Field.Index.ANALYZED));

http://lucene.apache.org/java/2_9_3/api/all/org/apache/lucene/document/Document.html#add%28org.apache.lucene.document.Fieldable%29
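If the description and keywords arrive as separate pieces, one option is to join them into a single string first and pass that string as the field value. The sketch below is only an illustration: the class name ExtraContent, the method build, and the sample values are hypothetical and are not part of Lucene or PDFBox. The Lucene call itself is left as a comment so that the example runs without any library on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene or PDFBox.
class ExtraContent {

    // Joins a description and a list of keywords into one space-separated
    // string, skipping null or blank entries.
    static String build(String description, List<String> keywords) {
        StringBuilder sb = new StringBuilder();
        if (description != null && !description.trim().isEmpty()) {
            sb.append(description.trim());
        }
        for (String keyword : keywords) {
            if (keyword == null || keyword.trim().isEmpty()) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(keyword.trim());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values, assumed for illustration only.
        String extra = build("Quarterly report", Arrays.asList("finance", "2010"));
        System.out.println(extra);

        // With Lucene 2.9 on the classpath, the value would then be added
        // in the same way as "your data" above:
        // document.add(new Field("content", extra, Field.Store.YES, Field.Index.ANALYZED));
    }
}
```

Adding one joined value keeps the number of extra "content" fields at one. Adding the description and each keyword as separate "content" fields should work as well, since a document may hold several fields with the same name.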

Koji

-- 
http://www.rondhuit.com/en/

