lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4883) Hide FieldCache behind an UninvertingFilterReader
Date Tue, 26 Mar 2013 12:33:15 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613730#comment-13613730 ]

Robert Muir commented on LUCENE-4883:
-------------------------------------

I've been thinking about this in a lot of detail over the last few months, so I have a few
more ideas (I'm not sure this is really the best/easiest path):

Currently FC "uses" the docvalues APIs, but violates them in a couple of ways. I was trying
to think of ways we could do this long term that would give us a filterreader that would also
pass checkindex. If we can do this, it's nice, as someone could call IndexWriter.addIndexes(ir)
and "upgrade" from fieldcache to docvalues. But unfortunately I think it's a good deal of work
and not easy to do immediately.

Anyway I think these are the three trickiest parts:

# How can we make the FilterReader's fieldinfos consistent with the docvalues types? I think
it needs to take this information up-front: a mapping of field names from the underlying fieldinfos
to docvalues types. Note that this would also make fieldcache "type insanity" impossible.
It also allows a possibility for someone to easily control which fields are allowed to have
fieldcaches built for them.
# How can we prevent non-dense ordinals (e.g. the case where someone "sorts on a multivalued
field")? Today Lucene happily allows this, but with a typed-no-insanity-filterreader
I think we should throw an exception instead. It means someone specified the
incorrect docvalues type for the field (it should have been SORTED_SET). Also, in the filterreader's
ctor, we can try to use underlying statistics on the field to detect up front whether any fields are
actually multivalued, and throw an exception early.
# How can we expose "missing" for NumericDocValues? One idea is just to see this "bitset"
as another NumericDocValues field (that only has values of 0 or 1) and provide sugar in the
API that makes this happen automatically. I think for SortedDocValues we should actually try
to move to the same approach long-term (instead of returning -1).
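
To make the three points above a bit more concrete, here is a rough sketch in plain Java. It does not use any Lucene classes: DvType, UninvertingConfig, LongValues and MissingSugar are hypothetical names invented for illustration, and the docCount/sumDocFreq arguments are assumed to come from whatever per-field statistics the underlying reader exposes. The choice of 1 for "has a value" is also an assumption.

```java
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the docvalues types; not Lucene's own enum.
enum DvType { NUMERIC, BINARY, SORTED, SORTED_SET }

// Hypothetical holder for the up-front mapping of field name -> docvalues type.
final class UninvertingConfig {
  private final Map<String, DvType> types;

  UninvertingConfig(Map<String, DvType> types) {
    // Defensive copy: the mapping is fixed once the reader is constructed.
    this.types = Collections.unmodifiableMap(new HashMap<>(types));
  }

  // One declared type per field, so "type insanity" cannot happen;
  // fields that are not mapped are never uninverted.
  DvType typeFor(String field) {
    DvType type = types.get(field);
    if (type == null) {
      throw new IllegalArgumentException("no docvalues type declared for field: " + field);
    }
    return type;
  }

  // Early check meant for the constructor. Assumption: if the postings of a
  // SORTED field total more than the number of docs that have the field, some
  // doc holds several terms. Only SORTED is checked here, because numeric
  // fields may index several terms per value.
  void checkSingleValued(String field, long docCount, long sumDocFreq) {
    if (typeFor(field) == DvType.SORTED && sumDocFreq > docCount) {
      throw new IllegalStateException(
          "field " + field + " looks multivalued; it should be declared SORTED_SET");
    }
  }
}

// Hypothetical per-document numeric accessor.
interface LongValues {
  long get(int docId);
}

final class MissingSugar {
  // Presents a "docs with a value" bitset as a numeric field holding only
  // 0 or 1 (assumption: 1 means the doc has a value).
  static LongValues existsAsNumeric(BitSet docsWithValue) {
    return docId -> docsWithValue.get(docId) ? 1L : 0L;
  }
}
```

With something like this, a reader constructed with "title" mapped to SORTED would reject statistics of 10 docs and 11 postings up front, while the same statistics on a field mapped to SORTED_SET would pass.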

                
> Hide FieldCache behind an UninvertingFilterReader
> -------------------------------------------------
>
>                 Key: LUCENE-4883
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4883
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>         Attachments: LUCENE-4883.patch
>
>
> From a discussion on the mailing list:
> {{
> rmuir:
> I think instead FieldCache should actually be completely package
> private and hidden behind a UninvertingFilterReader and accessible via
> the existing AtomicReader docValues methods.
> }}


