lucene-dev mailing list archives

From "Ishan Chattopadhyaya (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields
Date Tue, 17 Nov 2015 05:08:11 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008058#comment-15008058 ]

Ishan Chattopadhyaya edited comment on SOLR-8220 at 11/17/15 5:07 AM:
----------------------------------------------------------------------

Here's an alternate patch (SOLR-8220-ishan.patch) for this functionality. It uses the same hooks
that you've created, i.e. decorateDocValueFields(), and reuses your unit tests. Changes:
* Simplified the handling of single-valued dv fields, so there is less code.
* Added support for multivalued dv fields.
* In your patch, {{fl=*,mydvfield}} wasn't working. Fixed this.
* Added a '~' glob, similar to '*'. {{fl=~}} here means: return all conventional stored fields
and all non-stored docValues fields.

Some cleanup / refactoring and a few tests might be needed.
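
To make the intended {{fl}} semantics concrete, here is a minimal sketch in Python. It is not code from the patch and does not use any Solr API: the schema dictionary, the field names other than mydvfield, and the resolve_fl() helper are all hypothetical, and the treatment of fields that are neither stored nor docValues is an assumption.

```python
# Hypothetical schema: field name -> (stored, doc_values).
# This is an illustration only, not Solr's schema API.
SCHEMA = {
    "id": (True, False),
    "title": (True, True),
    "mydvfield": (False, True),   # non-stored docValues field
    "internal": (False, False),   # neither stored nor docValues
}


def resolve_fl(fl, schema=SCHEMA):
    """Return the sorted field names that an fl value would select."""
    selected = set()
    for token in (t.strip() for t in fl.split(",")):
        if not token:
            continue
        if token == "*":
            # '*' selects conventional stored fields only.
            selected.update(n for n, (stored, _) in schema.items() if stored)
        elif token == "~":
            # '~' selects stored fields plus non-stored docValues fields.
            selected.update(n for n, (stored, dv) in schema.items() if stored or dv)
        elif token in schema:
            # An explicit field is returned if it can be read from either source.
            # (Assumption: a field with neither source is silently skipped.)
            stored, dv = schema[token]
            if stored or dv:
                selected.add(token)
    return sorted(selected)
```

Under these assumptions, resolve_fl("*,mydvfield") and resolve_fl("~") select the same three fields for this schema, while resolve_fl("*") omits mydvfield.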


> Read field from docValues for non stored fields
> -----------------------------------------------
>
>                 Key: SOLR-8220
>                 URL: https://issues.apache.org/jira/browse/SOLR-8220
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Keith Laban
>         Attachments: SOLR-8220-ishan.patch, SOLR-8220.patch, SOLR-8220.patch
>
>
> Many times a field will be both stored="true" and docValues="true", which requires redundant
> data to be stored on disk. Since reading from docValues is both efficient and a common practice
> (facets, analytics, streaming, etc.), reading values from docValues when a stored version of
> the field does not exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields, as they would
> always be returned sorted in the docValues approach. I believe this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think it should
> live closer to where stored fields are loaded in the SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values from docValues; facets, analytics,
> streaming, etc. all seem to do it their own way. Perhaps some of this logic should be
> centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return the field from docValues if it is not stored and has docValues; if the field
> is stored, return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>    -- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first pass. 2b - is
> current behavior
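
The multiValued caveat in the issue description can be illustrated with a small Python sketch. It assumes a string field backed by sorted-set docValues, which keep each document's values as a sorted set; the de-duplication shown here follows from that assumption and is not stated in the issue. Both helper functions are hypothetical stand-ins, not Solr or Lucene calls.

```python
def read_stored(indexed_values):
    """Stored fields return values in the order they were indexed."""
    return list(indexed_values)


def read_sorted_set_doc_values(indexed_values):
    """Sketch of a sorted-set docValues read: unique values in sorted order.

    Assumption: the field uses sorted-set docValues, so the original
    ordering and any duplicates are not recoverable.
    """
    return sorted(set(indexed_values))


indexed = ["pear", "apple", "pear", "fig"]
stored_view = read_stored(indexed)                    # original order kept
doc_values_view = read_sorted_set_doc_values(indexed)  # sorted, unique
```

A client that relies on the original value order of a multiValued field would therefore see different output once the field is read from docValues instead of stored fields.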



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

