lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (LUCENE-1597) New Document and Field API
Date Wed, 19 Sep 2012 20:03:07 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1597.
----------------------------------------

    Resolution: Duplicate

I think this is more or less a duplicate of LUCENE-2308; we can open new issues for any differences.
                
> New Document and Field API
> --------------------------
>
>                 Key: LUCENE-1597
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1597
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: lucene-new-doc-api.patch
>
>
> This is a super rough prototype of what a new document API could look like. It's basically
> what I came up with during a long flight across the Atlantic :)
> It is not integrated with anything yet (like IndexWriter, DocumentsWriter, etc.) and
> heavily uses Java 1.5 features, such as generics and annotations.
> The general idea sounds similar to what Marvin is doing in KS, which I found out by reading
> Mike's comments on LUCENE-831; I haven't looked at the KS API myself yet.
> Main ideas:
> - Separate a field's value from its configuration; therefore this patch introduces two
>   classes: FieldDescriptor and FieldValue.
> - I was thinking that in most cases the documents people add to a Lucene index look alike,
>   i.e. they contain mostly the same fields with the same settings. Yet, for every field instance
>   the DocumentsWriter checks the settings and calls the right consumers, which themselves check
>   the settings and return true or false, indicating whether or not they want to do something
>   with that field. So I was thinking we could model the document API on the Class<->Object
>   concept of OO languages. There a class is a blueprint (as everyone knows :) ), and an object
>   is one instance of it. So in this patch I introduced a class called DocumentDescriptor, which
>   contains all FieldDescriptors with the field settings. This descriptor is given to the consumer
>   (IndexWriter) once in the constructor. Then the Document "instances" are created and added
>   via addDocument().
> - A Document instance allows adding "variable fields" in addition to the "fixed fields"
>   the DocumentDescriptor contains. For these fields the consumers have to check the field settings
>   for every document instance (as with the old document API). This is for maintaining Lucene's
>   flexibility that everyone loves.
> - Disregard the changes to AttributeSource for now. The code that's worth looking at
>   is contained in a new package "newdoc".
> Again, this is not a "real" patch, but rather a demo of how a new API could roughly work.
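
The descriptor/instance split described above can be illustrated with a minimal Java sketch. This is not the code from lucene-new-doc-api.patch: only the names FieldDescriptor, FieldValue, DocumentDescriptor, Document and addDocument() come from the description, while the constructors, the indexed/stored flags, the set() and addVariable() methods, and the toy Writer class standing in for IndexWriter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NewDocApiSketch {

    /** Field configuration only; the two boolean flags are assumed settings. */
    static final class FieldDescriptor {
        final String name;
        final boolean indexed;
        final boolean stored;
        FieldDescriptor(String name, boolean indexed, boolean stored) {
            this.name = name; this.indexed = indexed; this.stored = stored;
        }
    }

    /** A value paired with the descriptor that configures it. */
    static final class FieldValue {
        final FieldDescriptor descriptor;
        final String value;
        FieldValue(FieldDescriptor descriptor, String value) {
            this.descriptor = descriptor; this.value = value;
        }
    }

    /** The blueprint: the fixed fields shared by all documents. */
    static final class DocumentDescriptor {
        private final Map<String, FieldDescriptor> fixed = new LinkedHashMap<>();
        DocumentDescriptor add(FieldDescriptor fd) { fixed.put(fd.name, fd); return this; }
        FieldDescriptor get(String name) { return fixed.get(name); }
        Collection<FieldDescriptor> fields() { return fixed.values(); }
    }

    /** One instance of the blueprint, plus optional variable fields. */
    static final class Document {
        final DocumentDescriptor descriptor;
        final List<FieldValue> fixedValues = new ArrayList<>();
        final List<FieldValue> variableValues = new ArrayList<>();
        Document(DocumentDescriptor descriptor) { this.descriptor = descriptor; }

        /** Sets a fixed field; its settings come from the shared descriptor. */
        Document set(String name, String value) {
            FieldDescriptor fd = descriptor.get(name);
            if (fd == null) throw new IllegalArgumentException("Not a fixed field: " + name);
            fixedValues.add(new FieldValue(fd, value));
            return this;
        }

        /** Adds a variable field that carries its own settings. */
        Document addVariable(FieldDescriptor fd, String value) {
            variableValues.add(new FieldValue(fd, value));
            return this;
        }
    }

    /** Toy stand-in for the consumer; this is not Lucene's IndexWriter. */
    static final class Writer {
        private final DocumentDescriptor descriptor;
        private final Set<String> indexedFixed = new HashSet<>();
        final List<String> indexed = new ArrayList<>();
        int perDocumentChecks = 0;

        Writer(DocumentDescriptor descriptor) {
            this.descriptor = descriptor;
            // Settings of the fixed fields are inspected once, up front.
            for (FieldDescriptor fd : descriptor.fields()) {
                if (fd.indexed) indexedFixed.add(fd.name);
            }
        }

        void addDocument(Document doc) {
            if (doc.descriptor != descriptor) throw new IllegalArgumentException("Descriptor mismatch");
            for (FieldValue fv : doc.fixedValues) {
                // Only a lookup in the precomputed set; the settings are not re-read.
                if (indexedFixed.contains(fv.descriptor.name)) {
                    indexed.add(fv.descriptor.name + "=" + fv.value);
                }
            }
            for (FieldValue fv : doc.variableValues) {
                perDocumentChecks++; // variable fields are checked for every document
                if (fv.descriptor.indexed) indexed.add(fv.descriptor.name + "=" + fv.value);
            }
        }
    }

    public static void main(String[] args) {
        DocumentDescriptor dd = new DocumentDescriptor()
            .add(new FieldDescriptor("title", true, true))
            .add(new FieldDescriptor("id", false, true));
        Writer writer = new Writer(dd);
        writer.addDocument(new Document(dd)
            .set("title", "hello")
            .set("id", "42")
            .addVariable(new FieldDescriptor("tag", true, false), "demo"));
        System.out.println(writer.indexed + ", per-document checks: " + writer.perDocumentChecks);
    }
}
```

In this sketch the consumer reads the fixed-field settings once in its constructor, and only the variable fields cost a settings check per document, which is the trade-off the description outlines.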

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

