lucene-solr-user mailing list archives

From Jochen Barth <Ba...@ub.uni-heidelberg.de>
Subject Re: Stored vs non-stored very large text fields
Date Tue, 29 Apr 2014 20:16:35 GMT
Ok, https://wiki.apache.org/solr/SolrPerformanceFactors

states that: "Retrieving the stored fields of a query result can be a  
significant expense. This cost is affected largely by the number of  
bytes stored per document--the higher byte count, the sparser the  
documents will be distributed on disk and more I/O is necessary to  
retrieve the fields (usually this is a concern when storing large  
fields, like the entire contents of a document)."

But in my case (with docValues=true) there should be no reason to  
access *.fdt.

Kind regards,
Jochen

Quoting Jochen Barth <Barth@ub.uni-heidelberg.de>:

> Something is really strange here:
>
> Even with the fields id + sort_... configured as docValues="true" (so
> there is nothing to fetch from the "stored documents file"), performance
> is still terrible with ocr stored=true, _even_ with my patch, which
> stores uncompressed as in Solr 4.0.0 (checked with strings -a on *.fdt).
>
> Just reading
> http://lucene.472066.n3.nabble.com/Can-Solr-handle-large-text-files-td3439504.html ..
> Perhaps things will clear up soon (I will check whether splitting into
> indexed+non-stored and non-indexed+stored fields could help here).
>
>
> Kind regards,
> J. Barth
>
>
> Quoting Shawn Heisey <solr@elyograg.org>:
>
>> On 4/29/2014 4:20 AM, Jochen Barth wrote:
>>> BTW: stored field compression:
>>> are all "stored fields" within a document put into one
>>> compressed chunk,
>>> or is this done on a per-field basis?
>>
>> Here's the issue that added the compression to Lucene:
>>
>> https://issues.apache.org/jira/browse/LUCENE-4226
>>
>> It was made the default stored field format for Lucene, which also made
>> it the default for Solr.  At this time, there is no way to remove
>> compression on Solr without writing custom code.  I filed an issue to
>> make it configurable, but I don't know how to do it.  Nobody else has
>> offered a solution either.  One day I might find some time to take a
>> look at the issue and see if I can solve it myself.
>>
>> https://issues.apache.org/jira/browse/SOLR-4375
>>
>> Here's the author's blog post that goes into more detail than the LUCENE
>> issue:
>>
>> http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene
>>
>> Thanks,
>> Shawn
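
On the chunk-vs-field question above: as far as the linked LUCENE-4226 issue and blog post describe it, stored fields are compressed in chunks that hold all stored fields of one or more whole documents, not field by field. The following is a minimal Python sketch of that idea, not Lucene's code. zlib stands in for the LZ4 codec, and the 16 KB threshold and the names build_chunks and read_doc are assumptions made for illustration.

```python
import zlib

# Assumed chunk threshold in bytes, for illustration only.
CHUNK_SIZE = 16 * 1024


def _flush(chunks, index, buf, pending):
    """Compress the open buffer as one chunk and record where each doc lives."""
    chunk_no = len(chunks)
    chunks.append(zlib.compress(bytes(buf)))
    index.extend((chunk_no, off, length) for off, length in pending)


def build_chunks(docs, chunk_size=CHUNK_SIZE):
    """Pack the serialized stored fields of consecutive docs into compressed chunks.

    Returns (chunks, index), where index[doc_id] = (chunk_no, offset, length).
    A chunk is closed once it holds at least chunk_size uncompressed bytes.
    """
    chunks, index = [], []
    buf, pending = bytearray(), []
    for doc in docs:
        pending.append((len(buf), len(doc)))
        buf += doc
        if len(buf) >= chunk_size:
            _flush(chunks, index, buf, pending)
            buf, pending = bytearray(), []
    if pending:
        _flush(chunks, index, buf, pending)
    return chunks, index


def read_doc(chunks, index, doc_id):
    """Return (doc_bytes, bytes_decompressed) for a single document."""
    chunk_no, off, length = index[doc_id]
    raw = zlib.decompress(chunks[chunk_no])  # the whole chunk is decompressed
    return raw[off:off + length], len(raw)


if __name__ == "__main__":
    # Hypothetical documents: an id plus a large "ocr" payload per document.
    docs = [f"id={i};ocr=".encode() + b"x" * 200_000 for i in range(5)]
    chunks, index = build_chunks(docs)
    doc, cost = read_doc(chunks, index, 3)
    print(len(chunks), "chunks;", cost, "bytes decompressed to read one doc")
```

In this model, many small documents share one chunk, while a document with a large ocr field fills a chunk by itself, so fetching even a single small stored field of that document means reading the whole chunk.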


