lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Size of Document
Date Wed, 04 Jul 2018 16:08:56 GMT
But does size on disk help? If the doc has a zillion
images in it, those aren't part of the resulting index
(I'm excluding stored data here)....
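
Given that caveat, one rough way to get a number for the original question is to sum the bytes of the field values you are about to add, rather than relying on the source file's size on disk. The sketch below is plain Java and not Lucene API: `DocSizeSketch`, `utf8Length`, `estimateBytes` and the map of field values are hypothetical stand-ins. With a real Lucene `Document` you would presumably walk `getFields()` and measure each field's `stringValue()` or `binaryValue()` in the same way. It only measures the input; what actually lands in the index depends on analysis, compression and which fields are stored.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Lucene): estimates the raw size of the
// field values that are about to be handed to the indexer.
final class DocSizeSketch {

    // Number of bytes the string occupies once encoded as UTF-8.
    static long utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Sums value sizes only; field names are deliberately not counted.
    // byte[] values are measured directly, everything else via toString(),
    // which is only an approximation for numeric fields.
    static long estimateBytes(Map<String, Object> fieldValues) {
        long total = 0;
        for (Object value : fieldValues.values()) {
            if (value instanceof byte[]) {
                total += ((byte[]) value).length;
            } else if (value != null) {
                total += utf8Length(value.toString());
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed example fields, for illustration only.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "Size of Document");
        doc.put("thumbnail", new byte[] {1, 2, 3, 4});
        System.out.println(estimateBytes(doc) + " bytes of field values");
    }
}
```

A map is used here only to keep the sketch short; it cannot represent a document with several fields of the same name, which Lucene allows.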

On Wed, Jul 4, 2018 at 7:49 AM, Terry Steichen <terry@net-frame.com> wrote:
> In the document types I usually index (.pdf, .docx/.doc, .eml), there
> exists a metadata field called "stream_size" that contains the size of
> the document on disk.  You don't have to compute it.  Thus, when you
> retrieve each document you can pull out the contents of this field and,
> if you like, include it in each hitlist entry.
>
>
> On 07/04/2018 05:26 AM, Chris and Helen Bamford wrote:
>> Hi there,
>>
>> How can I calculate the total size of a Lucene Document that I'm about
>> to write to an index so I know how many bytes I am writing please?  I
>> need it for some external metrics collection.
>>
>> Thanks
>>
>> - Chris
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>


