lucene-java-user mailing list archives

From Gregor Dorfbauer <gregor.dorfba...@lagentz.at>
Subject Index Field feeded from Reader that also stores cleartext
Date Fri, 03 Sep 2010 09:23:34 GMT
 Hi!

I'm working on an indexer that should process documents on the hard disk,
which can be of arbitrary size and type. I use Apache Tika for plain-text
extraction, which offers the option to stream the parser's output through
a Reader.

My problem is the following:
Is it possible to create a document field that gets its data
from a Reader instance and whose plain text is also stored in the
index (as Store.YES does for a field)?
If I can't stream the data, memory usage exceeds the limits of my
machine.
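
To illustrate the memory issue, here is a minimal sketch in plain Java. It
makes no Lucene or Tika calls; the class and method names are made up for
this example. The first method consumes a Reader in fixed-size chunks, which
is roughly the behaviour I would like to keep. The second collects the whole
text into one String, which is what I am trying to avoid for large documents.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReaderMemorySketch {

    // Streaming: only one fixed-size buffer is held at a time,
    // regardless of how much text the reader delivers.
    static long consumeStreaming(Reader reader) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            total += n; // an indexer would process this chunk here
        }
        return total;
    }

    // Buffering: the complete text ends up in a single String,
    // so memory use grows with the size of the document.
    static String readFully(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the reader that text extraction returns.
        String text = "extracted plain text";
        System.out.println(consumeStreaming(new StringReader(text)));
        System.out.println(readFully(new StringReader(text)).length());
    }
}
```

With the second approach the text could be handed over as a String, but then
every document has to fit into memory at once.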


Thanks for your help,
Gregor
