lucene-java-user mailing list archives

From Gregor Dorfbauer
Subject Index Field feeded from Reader that also stores cleartext
Date Fri, 03 Sep 2010 09:23:34 GMT

I'm working on an indexer that has to process documents on disk, which 
can be of arbitrary size and type. I use Apache Tika for plain-text 
extraction; it can stream the parser's output through a Reader.

My problem is the following:
Is it possible to create a document field that gets its data from a 
Reader instance and whose plain text is also stored in the index (as a 
Store.YES field does)?
If I can't stream the data, memory usage is exceeding the limits of my 
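
To make the constraint concrete: as far as I can tell from the Lucene 3.x 
Javadoc, the Field(String, Reader) constructor creates a tokenized, indexed 
field that is not stored, while stored values have to be passed as a String 
or byte[]. One possible workaround, sketched below with the JDK only, is to 
drain the Reader in bounded chunks so that only one chunk is in memory at a 
time. The class ReaderChunks and the method forEachChunk are hypothetical 
names, not part of Lucene or Tika, and how each chunk would be written to 
the index (separate stored values, separate documents) is left open here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not a Lucene or Tika class): drains a Reader in bounded chunks. */
final class ReaderChunks {

    /**
     * Reads the reader to the end and hands the text to the sink in pieces of
     * at most chunkChars characters, so only one chunk is buffered at a time.
     * Chunk boundaries are arbitrary and may fall inside a word.
     * Returns the total number of characters read.
     */
    static long forEachChunk(Reader reader, int chunkChars, Consumer<String> sink)
            throws IOException {
        if (chunkChars <= 0) {
            throw new IllegalArgumentException("chunkChars must be positive");
        }
        char[] buffer = new char[chunkChars];
        long total = 0;
        int filled = 0;
        int n;
        // read() may return fewer chars than requested, so fill the buffer before emitting.
        while ((n = reader.read(buffer, filled, chunkChars - filled)) != -1) {
            filled += n;
            total += n;
            if (filled == chunkChars) {
                sink.accept(new String(buffer, 0, filled));
                filled = 0;
            }
        }
        if (filled > 0) {
            sink.accept(new String(buffer, 0, filled)); // final partial chunk
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        List<String> chunks = new ArrayList<>();
        // In the indexer, the StringReader would be the Reader returned by Tika (assumption).
        long total = forEachChunk(new StringReader("abcdefghij"), 4, chunks::add);
        System.out.println(total + " chars in " + chunks); // prints "10 chars in [abcd, efgh, ij]"
    }
}
```

The sink is a callback rather than a returned list so that the caller decides 
what happens to each chunk; collecting all chunks, as main does for the demo, 
would bring back the memory problem for large documents.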

Thanks for your help,
