lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3855) DocValues support
Date Thu, 08 Nov 2012 12:25:12 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493138#comment-13493138 ]

Adrien Grand commented on SOLR-3855:
------------------------------------

{quote}
Regarding performance - it seems like for most users, the number of docvalue fields should
be relatively small.
One of the big advantages to DocValues is the better caching by the OS - so "seeks" should
often never hit the disk.
{quote}

I agree that it is unlikely to affect performance for many users, but on the other hand I
don't like the fact that Solr could suddenly become extremely slow once doc values fields
grow larger than the I/O cache.

bq. stored=[docValues method] // store separately using the given method

I'm afraid it could be confusing for users: doc values are very different from stored fields,
both feature-wise (sorting, function values) and performance-wise (up to 1 seek per doc vs. up
to 1 seek per field), so I think we should use another parameter name?
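
To make the seek difference concrete, here is a rough back-of-the-envelope sketch in Python.
It assumes the worst case where nothing is in the I/O cache, and it assumes that "1 seek per
doc" refers to stored fields (all fields of a document are read together) while "1 seek per
field" refers to doc values read from disk (each field lives in its own column). The function
names and numbers are purely illustrative and are not part of Lucene or Solr.

```python
def stored_fields_seeks(num_docs: int, num_fields: int) -> int:
    """Worst-case seeks with stored fields (assumption: 1 seek per document)."""
    # num_fields does not matter here: one seek fetches every field of a doc.
    return num_docs


def doc_values_seeks(num_docs: int, num_fields: int) -> int:
    """Worst-case seeks with on-disk doc values (assumption: 1 seek per field per document)."""
    # Each field is a separate column, so each (doc, field) pair may need its own seek.
    return num_docs * num_fields


if __name__ == "__main__":
    docs, fields = 10, 5  # hypothetical result page: 10 docs, 5 fields each
    print("stored fields:", stored_fields_seeks(docs, fields), "seeks")
    print("doc values:   ", doc_values_seeks(docs, fields), "seeks")
```

Under these assumptions the gap grows linearly with the number of fields fetched per document,
which is why reusing the "stored" name for doc values could hide a very different cost.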

bq. But it seems like that should just be a default and one should be able to access the field
via direct or memory depending on the situation?

To avoid surprises (OOM on the one hand, extreme slowness on the other), I think we should
stick to an explicit access method specified in the schema? (I plan to fix SortField/FieldComparator
so that it doesn't force doc values to be memory-resident when sorting.)

The question of whether to load doc values fields by default seems to raise lots of concerns.
Maybe we should fix this issue with no promise that doc values fields will be loaded by default,
and open another issue to decide whether doing so is reasonable? (I'm just afraid
that consensus might be hard to reach on that point, while everyone seems to agree that
DocValues support is an improvement?)
                
> DocValues support
> -----------------
>
>                 Key: SOLR-3855
>                 URL: https://issues.apache.org/jira/browse/SOLR-3855
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1, 5.0
>
>         Attachments: SOLR-3855.patch, SOLR-3855.patch
>
>
> It would be nice if Solr supported DocValues:
>  - for ID fields (fewer disk seeks when running distributed search),
>  - for sorting/faceting/function queries (faster warmup time than fieldcache),
>  - better on-disk and in-memory efficiency (you can use packed impls).


