lucene-dev mailing list archives

From "Chris Male (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Date Fri, 01 Jun 2012 09:36:23 GMT


Chris Male commented on LUCENE-3312:

bq. With the Field class implementing IndexableField and StorableField, and on retrieval returning
a different class that implements only StorableField?

Yes, Nikola has included a StoredDocument class for that.  This would prevent users from thinking
they can just take a search result and pass it into being indexed.  It creates a clear separation
between indexing and search results.
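
As a rough illustration of that separation, here is a minimal sketch. Everything in it is a
simplified, hypothetical stand-in (the method names, StoredOnlyField and ToyIndexWriter are
made up for this example) and not the code in Nikola's patch:

```java
import java.util.ArrayList;
import java.util.List;

final class SeparationSketch {
    // Hypothetical, simplified interfaces; the real ones carry far more.
    interface IndexableField { String name(); String indexedValue(); }
    interface StorableField  { String name(); String storedValue(); }

    // Input side: one Field can be handed to the indexer and also stored.
    static final class Field implements IndexableField, StorableField {
        private final String name;
        private final String value;
        Field(String name, String value) { this.name = name; this.value = value; }
        public String name() { return name; }
        public String indexedValue() { return value; }
        public String storedValue() { return value; }
    }

    // Retrieval side: implements StorableField only (hypothetical name).
    static final class StoredOnlyField implements StorableField {
        private final String name;
        private final String value;
        StoredOnlyField(String name, String value) { this.name = name; this.value = value; }
        public String name() { return name; }
        public String storedValue() { return value; }
    }

    // What a search hit hands back: stored fields, nothing indexable.
    static final class StoredDocument {
        private final List<StorableField> fields = new ArrayList<>();
        void add(StorableField f) { fields.add(f); }
        List<StorableField> fields() { return fields; }
    }

    // Toy indexer (assumption): accepts IndexableField only, so passing
    // a StoredDocument's fields to it does not compile.
    static final class ToyIndexWriter {
        int indexed = 0;
        void addDocument(Iterable<? extends IndexableField> doc) {
            for (IndexableField f : doc) {
                indexed++;
            }
        }
    }
}
```

With types shaped like this, writer.addDocument(hit.fields()) is rejected by the compiler,
which is the "can't re-index a search result by accident" property described above.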

bq. But the strong decoupling of stored/indexed parts of a field has its benefits too (arbitrary
sequences of stored/indexed parts of fields)... and if you require a specific implementation
at the level of (input) Document then you prevent users from using their own impls. of strongly
decoupled sequences of StoredField/IndexedField.

I agree that there are benefits to the decoupling.  It's just that one of the important factors
in this issue and other work in and around Document & Field is creating a cleaner API
for users.  I'm not sure bogging the document.Document API down with having to manage both
Storable and IndexableField instances is worth it.  Field is already basically a parent class
with the extensive list of specializations we now have.

I'm wondering whether expert users who are using their own Storable/IndexableField impls will
also want their own 'Document' impls as well, maybe to support direct streaming of fields
or something.  If we enforce this, then we have a consistent policy: to use these expert
interfaces, you have to provide your own implementations for everything.

With all that said, I'm open to a clean API in Document that can do everything :)
> Break out StorableField from IndexableField
> -------------------------------------------
>                 Key: LUCENE-3312
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Nikola Tankovic
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: Field Type branch
>         Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch,
> In the field type branch we have strongly decoupled
> Document/Field/FieldType impl from the indexer, by having only a
> narrow API (IndexableField) passed to IndexWriter.  This frees apps up
> to use their own "documents" instead of the "user-space" impls we provide
> in oal.document.
> Similarly, with LUCENE-3309, we've done the same thing on the
> doc/field retrieval side (from IndexReader), with the
> StoredFieldsVisitor.
> But, maybe we should break out StorableField from IndexableField,
> such that when you index a doc you provide two Iterables -- one for the
> IndexableFields and one for the StorableFields.  Either can be null.
> One downside is possible perf hit for fields that are both indexed &
> stored (ie, we visit them twice, lookup their name in a hash twice,
> etc.).  But the upside is a cleaner separation of concerns in API....
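
For reference, the two-Iterables idea in the description above could look roughly like the
sketch below. It is only an illustration under stated assumptions: the interfaces are
stripped-down stand-ins, and BothField and ToyWriter are hypothetical names, not anything
from the branch. The counters show the double visit for a field that is both indexed and
stored.

```java
final class TwoIterablesSketch {
    // Hypothetical, simplified stand-ins for the interfaces under discussion.
    interface IndexableField { String name(); }
    interface StorableField  { String name(); }

    // A field that is both indexed and stored appears in both iterables.
    static final class BothField implements IndexableField, StorableField {
        private final String name;
        BothField(String name) { this.name = name; }
        public String name() { return name; }
    }

    // Toy writer (assumption): only counts how often each side is visited.
    static final class ToyWriter {
        int indexedVisits = 0;
        int storedVisits = 0;

        // Either argument may be null, meaning nothing to index or nothing to store.
        void addDocument(Iterable<? extends IndexableField> indexed,
                         Iterable<? extends StorableField> stored) {
            if (indexed != null) {
                for (IndexableField f : indexed) {
                    indexedVisits++;
                }
            }
            if (stored != null) {
                for (StorableField f : stored) {
                    storedVisits++;
                }
            }
        }
    }
}
```

Passing the same BothField in both iterables means it is walked once per side, which is the
possible perf cost the description mentions for indexed-and-stored fields.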
