lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2621) Extend Codec to handle also stored fields and term vectors
Date Wed, 05 Oct 2011 12:48:35 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120899#comment-13120899 ]

Robert Muir commented on LUCENE-2621:
-------------------------------------

Yeah, another alternative name for FieldCodec would be something like PostingsFormat (or similar),

because there is a big difference between PreFlex (which is actually a codec) and Memory/Pulsing.

As for the current patch: yeah, it's missing a lot... because as soon as I started digging
here I thought, well, we probably have to try to fix these codec classes first before going
further. Really, I would like some of the stuff like shared docstores to be private to the
PreFlex codec, as part of fixing the files() issue.

Same with the merging: this bulk copying of index inputs should be in the codec as well. Currently
it's not only wrong, as you noted, but it also makes assumptions about the implementation. But I
didn't want to just shove this into CodecProvider, since it doesn't really belong there.
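
To illustrate the merging point, here is a minimal sketch of what "bulk copying under codec control" could look like. Every type and method name below (SegmentFieldsReader, FieldsWriter, InMemorySegment, InMemoryWriter) is hypothetical and invented for this example; none of it is Lucene's API or taken from the patch. The only idea shown is that the writer itself decides whether raw bytes can be copied, so the caller makes no assumptions about the on-disk format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not Lucene's API.
interface SegmentFieldsReader {
    String formatName();       // which format wrote this segment
    int docCount();
    byte[] rawBytes(int doc);  // encoded form, only meaningful to the same format
    String decode(int doc);    // format-independent view of the document
}

abstract class FieldsWriter {
    abstract String formatName();
    abstract void appendRaw(byte[] encoded);
    abstract void appendDecoded(String doc);

    // The merge decision lives with the format: raw bytes are copied only when
    // the source segment was written by the same format, otherwise each
    // document is decoded and re-encoded. Returns the number of bulk-copied docs.
    int merge(List<SegmentFieldsReader> segments) {
        int bulkCopied = 0;
        for (SegmentFieldsReader seg : segments) {
            boolean sameFormat = formatName().equals(seg.formatName());
            for (int doc = 0; doc < seg.docCount(); doc++) {
                if (sameFormat) {
                    appendRaw(seg.rawBytes(doc));
                    bulkCopied++;
                } else {
                    appendDecoded(seg.decode(doc));
                }
            }
        }
        return bulkCopied;
    }
}

// Toy in-memory segment, standing in for a real on-disk segment.
final class InMemorySegment implements SegmentFieldsReader {
    private final String format;
    private final List<String> docs;

    InMemorySegment(String format, List<String> docs) {
        this.format = format;
        this.docs = docs;
    }

    @Override public String formatName() { return format; }
    @Override public int docCount() { return docs.size(); }
    @Override public byte[] rawBytes(int doc) {
        return docs.get(doc).getBytes(StandardCharsets.UTF_8);
    }
    @Override public String decode(int doc) { return docs.get(doc); }
}

// Toy writer whose "encoding" is plain UTF-8.
final class InMemoryWriter extends FieldsWriter {
    final List<String> written = new ArrayList<>();

    @Override String formatName() { return "utf8"; }
    @Override void appendRaw(byte[] encoded) {
        written.add(new String(encoded, StandardCharsets.UTF_8));
    }
    @Override void appendDecoded(String doc) { written.add(doc); }
}
```

With this shape, merging a "utf8" segment and a segment from some other format through an InMemoryWriter bulk-copies only the first one and re-encodes the second.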

Finally, I do think the CodecProvider has a place after we fix these names. But I don't think
it should really be any more than name -> Codec resolution... currently it does too much.
But to fix this, we really want to remove all the special per-field map, etc. stuff it has...
and this means factoring PerFieldCodecWrapper back out into codecs (in my opinion this should
be PerFieldPostingsFormat, and just an 'ordinary' Codec). And for that to work correctly,
we need FieldInfos reading/writing under codec control so that this per-field stuff can be
private to PerFieldPostingsFormat....
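
As a rough sketch of that split (not the patch, and not Lucene's API): the provider below does nothing except name resolution, and the per-field routing table is a private detail of one ordinary format. The interface shape, NamedPostingsFormat, SimpleCodecProvider, and the route/formatFor methods are all assumptions made up for illustration; only the names PostingsFormat and PerFieldPostingsFormat come from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;

// All types here are hypothetical illustrations, not Lucene's actual classes.
interface PostingsFormat {
    String name();
}

// A plain format identified only by its name.
final class NamedPostingsFormat implements PostingsFormat {
    private final String name;

    NamedPostingsFormat(String name) { this.name = name; }

    @Override public String name() { return name; }
}

// An 'ordinary' format that keeps its field -> format map to itself.
final class PerFieldPostingsFormat implements PostingsFormat {
    private final Map<String, PostingsFormat> byField = new HashMap<>();
    private final PostingsFormat defaultFormat;

    PerFieldPostingsFormat(PostingsFormat defaultFormat) {
        this.defaultFormat = defaultFormat;
    }

    // Registers a format for one field; returns this for chaining.
    PerFieldPostingsFormat route(String field, PostingsFormat format) {
        byField.put(field, format);
        return this;
    }

    // Falls back to the default format for fields with no explicit route.
    PostingsFormat formatFor(String field) {
        return byField.getOrDefault(field, defaultFormat);
    }

    @Override public String name() { return "PerField"; }
}

// The provider is reduced to name -> format resolution and knows nothing
// about fields.
final class SimpleCodecProvider {
    private final Map<String, PostingsFormat> byName = new HashMap<>();

    void register(PostingsFormat format) {
        byName.put(format.name(), format);
    }

    PostingsFormat lookup(String name) {
        PostingsFormat format = byName.get(name);
        if (format == null) {
            throw new IllegalArgumentException("unknown format: " + name);
        }
        return format;
    }
}
```

In this arrangement a caller would register a PerFieldPostingsFormat like any other format and look it up by name; which field goes to which format never leaks into the provider.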

So there is a ton to do. Although I made a branch, I'm kinda concerned about doing a bunch
of renaming and keeping things in sync... maybe I should ignore this though. But for now I've
been trying to figure out any way we can do this in individual incremental steps/issues directly
on trunk; it's always nice to make progress that way.
                
> Extend Codec to handle also stored fields and term vectors
> ----------------------------------------------------------
>
>                 Key: LUCENE-2621
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2621
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Robert Muir
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2621_rote.patch
>
>
> Currently Codec API handles only writing/reading of term-related data, while stored fields
> data and term frequency vector data writing/reading is handled elsewhere.
> I propose to extend the Codec API to handle this data as well.
