lucene-dev mailing list archives

From "Chris Male (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Date Mon, 29 Aug 2011 04:20:45 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092625#comment-13092625
] 

Chris Male commented on LUCENE-3312:
------------------------------------

With LUCENE-2308 out of the way, I've started looking into this more deeply.  Changing the
indexer code has not been especially difficult since there is already a clear separation in
the handling of indexed and stored fields.  The challenges are in the consumer / user code.
So I have a couple of questions I'm hoping for some opinions on:

- Because FieldInfo is maintained per field name, if an IndexableField and a StorableField
are added to a Document separately but with the same name, a single FieldInfo will be created
noting that the field is both indexed and stored.  This isn't a problem in itself, but a lot of
code used to leverage this fact to get metadata about indexed Fields using searcher.document(docId).
It would retrieve all the stored fields and then see which were also indexed (and read the
associated metadata).  This seems like a bit of a hack, piggybacking on stored fields to find out
about their indexing attributes.  So I guess it cannot continue going forward?  When you pull the
StorableFields, should you only be able to access the stored value metadata?

- By creating this separation, we will need some notion of a Document in index.* which provides
Iterable access to both the IndexableFields and StorableFields.  As such, Document itself
is becoming more userland.  However, by letting it store IndexableFields and StorableFields separately,
the functionality it provides (getBinaryValue for example) becomes quite verbose because it
must provide implementations for both kinds of fields.  Given that Field is a userland implementation
of both IndexableField and StorableField, should Document work solely with Fields?  Or should we
allow people to register both kinds of fields separately and just have a verbose set of functionality?
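
To make the first point concrete, here is a minimal sketch of the per-name merging, under
stated assumptions.  The FieldInfo and FieldInfos classes below are hypothetical stand-ins
written for illustration, not Lucene's actual classes or signatures.  All they show is that
two separate additions under one name collapse into a single entry whose indexed and stored
flags are OR-ed together, which is what the old searcher.document(docId) trick relied on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types are Lucene's real classes.
class FieldInfoMergeDemo {

    // Per-field-name metadata, loosely modelled on the idea of a FieldInfo.
    static final class FieldInfo {
        final String name;
        boolean indexed;
        boolean stored;

        FieldInfo(String name) {
            this.name = name;
        }
    }

    // Registry keyed by field name: one FieldInfo per name.
    static final class FieldInfos {
        private final Map<String, FieldInfo> byName = new LinkedHashMap<>();

        // Flags from separate additions under the same name are OR-ed together.
        FieldInfo addOrUpdate(String name, boolean indexed, boolean stored) {
            FieldInfo info = byName.computeIfAbsent(name, FieldInfo::new);
            info.indexed |= indexed;
            info.stored |= stored;
            return info;
        }

        int size() {
            return byName.size();
        }
    }

    public static void main(String[] args) {
        FieldInfos infos = new FieldInfos();
        infos.addOrUpdate("title", true, false);                   // added as an indexable field
        FieldInfo title = infos.addOrUpdate("title", false, true); // added as a storable field
        System.out.println(infos.size() + " entry, indexed=" + title.indexed
                + ", stored=" + title.stored);
    }
}
```

If StorableFields only expose stored value metadata, user code would have to get the
indexed flag from something like this registry rather than from the retrieved document.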

> Break out StorableField from IndexableField
> -------------------------------------------
>
>                 Key: LUCENE-3312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3312
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>             Fix For: Field Type branch
>
>
> In the field type branch we have strongly decoupled
> Document/Field/FieldType impl from the indexer, by having only a
> narrow API (IndexableField) passed to IndexWriter.  This frees apps up
> to use their own "documents" instead of the "user-space" impls we provide
> in oal.document.
> Similarly, with LUCENE-3309, we've done the same thing on the
> doc/field retrieval side (from IndexReader), with the
> StoredFieldsVisitor.
> But, maybe we should break out StorableField from IndexableField,
> such that when you index a doc you provide two Iterables -- one for the
> IndexableFields and one for the StorableFields.  Either can be null.
> One downside is possible perf hit for fields that are both indexed &
> stored (ie, we visit them twice, lookup their name in a hash twice,
> etc.).  But the upside is a cleaner separation of concerns in API....
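
The two-Iterables proposal quoted above could look roughly like the following sketch.
Everything here is an assumption made for illustration: the Indexer class, the addDocument
signature, and the one-method IndexableField and StorableField interfaces are hypothetical
and are not the API in the field type branch.  The nameLookups counter is only there to
make the possible perf hit visible: a field that is both indexed and stored has its name
looked up twice.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and signatures are illustrative, not Lucene's API.
class TwoIterablesDemo {

    interface IndexableField { String name(); }

    interface StorableField { String name(); }

    static final class Indexer {
        // Per-name flags: [0] = indexed, [1] = stored.
        private final Map<String, boolean[]> flagsByName = new HashMap<>();
        int nameLookups = 0;

        private boolean[] lookup(String name) {
            nameLookups++;
            return flagsByName.computeIfAbsent(name, n -> new boolean[2]);
        }

        // Either argument may be null, meaning "no fields of that kind".
        void addDocument(Iterable<? extends IndexableField> indexable,
                         Iterable<? extends StorableField> storable) {
            if (indexable != null) {
                for (IndexableField f : indexable) {
                    lookup(f.name())[0] = true;
                }
            }
            if (storable != null) {
                for (StorableField f : storable) {
                    lookup(f.name())[1] = true;
                }
            }
        }

        boolean isIndexed(String name) {
            boolean[] flags = flagsByName.get(name);
            return flags != null && flags[0];
        }

        boolean isStored(String name) {
            boolean[] flags = flagsByName.get(name);
            return flags != null && flags[1];
        }
    }

    public static void main(String[] args) {
        Indexer indexer = new Indexer();
        IndexableField indexedTitle = () -> "title";
        StorableField storedTitle = () -> "title";
        indexer.addDocument(Collections.singletonList(indexedTitle),
                            Collections.singletonList(storedTitle));
        System.out.println("name lookups for one indexed+stored field: " + indexer.nameLookups);
    }
}
```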
