lucene-dev mailing list archives

From jafarim <>
Subject Storing whole document in the index
Date Sat, 17 Mar 2007 12:36:32 GMT
I have been using Lucene for a while and, as most people seemingly do, I
used to store only a few important fields of a document in the index. But
recently I thought: why not store the whole document's bytes as an untokenized
field in the index, in order to ease retrieval? For example,
serialize the PDF file into a byte[] and then save the bytes as a field in
the index (some gzip and base64 encoding may be needed as glue logic). Then
I can delete the original file from the system. Is there any reason against
this idea? Can Lucene handle such a large volume of streamed input data?
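
A minimal sketch of the gzip and base64 glue logic described above, assuming
Java 8 or later (for java.util.Base64). The class name DocumentCodec and its
methods are hypothetical names used only for illustration; the Lucene call that
would actually store the resulting string as a field is deliberately left out,
since it depends on the Lucene version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: turns raw document bytes into a text value that
// could be kept in a stored, untokenized field, and back again.
final class DocumentCodec {
    private DocumentCodec() {}

    // Compress the raw bytes with gzip, then base64-encode the result.
    static String encode(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Reverse of encode: base64-decode, then gunzip back to the raw bytes.
    static byte[] decode(String stored) {
        byte[] compressed = Base64.getDecoder().decode(stored);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The string returned by encode would be the value saved in the index, and
decode would recover the original file bytes at retrieval time. Note that
base64 enlarges its input by roughly a third, and an already-compressed
format such as PDF may gain little from gzip.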
