lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Date Tue, 30 Aug 2011 07:22:38 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093499#comment-13093499 ]

Simon Willnauer commented on LUCENE-3312:
-----------------------------------------

bq. I'm almost done getting an initial patch for this, just one issue remaining - IndexDocValues.
IndexDocValues can be both not indexed and not stored. Therefore when you retrieve the indexed
fields and then the stored fields, you can miss some IndexDocValues. It seems to me that we
might need a 3rd interface to cover these fields?

To me it appears that we need some clarification of what DocValues are. When you think
about it, Stored Fields and DocValues have a lot in common. A Stored Field is basically a DocValues
DerefVarBytes type, and maybe down the road we should think about merging those two types.
It would be nice to have only one typesafe API that can store whatever you want and, based
on the codec, Lucene would decide how to store it on disk, i.e. whether it is a multi-field container,
as Stored Fields are done today, or whether the values are split apart, as DocValues does it
today.
For now we should try to differentiate between an InvertedField and a StoredField, i.e. everything
that is not an InvertedField is a StoredField. The API could already reflect that
DocValues and StoredFields are the same and simply specify a type like Store.Packed vs. Store.ColumnStride
or something like that. If we do that, we could also expose loading "Packed" fields via PerDocValues
and have one API for our users.
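
To make the idea a bit more concrete, here is a rough sketch of what such a split might look like. Every name in it (Store, InvertedField, StoredField, SimpleStoredField, LayoutSketch) is hypothetical and made up for illustration; none of this is an existing Lucene API, and the layoutFor method only stands in for whatever decision a codec would really make.

```java
// Hypothetical sketch only: these types do not exist in Lucene.

// How a non-inverted value should be laid out on disk.
enum Store { PACKED, COLUMN_STRIDE }

// Anything that goes through the inverted index.
interface InvertedField {
    String name();
    String text();
}

// Everything that is not inverted: today's stored fields and DocValues alike.
interface StoredField {
    String name();
    byte[] value();
    Store store();
}

final class SimpleStoredField implements StoredField {
    private final String name;
    private final byte[] value;
    private final Store store;

    SimpleStoredField(String name, byte[] value, Store store) {
        this.name = name;
        this.value = value;
        this.store = store;
    }

    public String name() { return name; }
    public byte[] value() { return value; }
    public Store store() { return store; }
}

final class LayoutSketch {
    // Stand-in for the codec choosing an on-disk layout from the declared type.
    static String layoutFor(StoredField field) {
        switch (field.store()) {
            case PACKED:
                return "multi-field container";   // like stored fields today
            case COLUMN_STRIDE:
                return "per-field column";        // like DocValues today
            default:
                throw new IllegalArgumentException("unknown store type");
        }
    }
}
```

With something like this, a title would be declared as Store.PACKED and a sort key as Store.COLUMN_STRIDE, and both would go through the same StoredField interface.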

> Break out StorableField from IndexableField
> -------------------------------------------
>
>                 Key: LUCENE-3312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3312
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>             Fix For: Field Type branch
>
>
> In the field type branch we have strongly decoupled
> Document/Field/FieldType impl from the indexer, by having only a
> narrow API (IndexableField) passed to IndexWriter.  This frees apps up
> to use their own "documents" instead of the "user-space" impls we provide
> in oal.document.
> Similarly, with LUCENE-3309, we've done the same thing on the
> doc/field retrieval side (from IndexReader), with the
> StoredFieldsVisitor.
> But, maybe we should break out StorableField from IndexableField,
> such that when you index a doc you provide two Iterables -- one for the
> IndexableFields and one for the StorableFields.  Either can be null.
> One downside is possible perf hit for fields that are both indexed &
> stored (ie, we visit them twice, lookup their name in a hash twice,
> etc.).  But the upside is a cleaner separation of concerns in API....
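
The two-Iterable proposal in the issue description above could be sketched roughly as follows. This is an illustration under assumptions, not the patch: IndexableFieldSketch, StorableFieldSketch, BothField and TwoIterableWriter are hypothetical stand-ins, and the lookup counter exists only to show the double visit for fields that are both indexed and stored.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for the two proposed interfaces.
interface IndexableFieldSketch { String name(); }
interface StorableFieldSketch { String name(); }

// A field that is both indexed and stored implements both interfaces.
final class BothField implements IndexableFieldSketch, StorableFieldSketch {
    private final String name;

    BothField(String name) { this.name = name; }

    public String name() { return name; }
}

final class TwoIterableWriter {
    // Counts how many times each field name is looked up while adding documents.
    final Map<String, Integer> lookups = new HashMap<>();

    // Either argument may be null, as the issue description suggests.
    void addDocument(Iterable<? extends IndexableFieldSketch> indexed,
                     Iterable<? extends StorableFieldSketch> stored) {
        if (indexed != null) {
            for (IndexableFieldSketch f : indexed) {
                lookups.merge(f.name(), 1, Integer::sum);
            }
        }
        if (stored != null) {
            for (StorableFieldSketch f : stored) {
                lookups.merge(f.name(), 1, Integer::sum);
            }
        }
    }
}
```

Passing the same BothField in both Iterables results in two lookups for its name, which is the possible perf hit mentioned in the description.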

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

