lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4591) Make StoredFieldsFormat more configurable
Date Sat, 08 Dec 2012 15:07:21 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527160#comment-13527160 ]

Adrien Grand commented on LUCENE-4591:
--------------------------------------

I had a look at CompressingStoredFieldsWriter, and I think that a different encoding/compression
strategy per field would warrant a separate StoredFieldsFormat implementation (we had this
discussion in LUCENE-4226, but in that case I think we could open up CompressingStoredFieldsIndexWriter/Reader).
However, if you don't mind adding one or two extra random seeks, maybe you could reuse it
without extending it, for example:
{code}
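class MyCustomStoredFieldsWriter extends StoredFieldsWriter {

  StoredFieldsWriter defaultSfw; // the default Lucene 4.1 stored fields writer

  @Override
  public void writeField(FieldInfo info, StorableField field) throws IOException {
    // isStandard is your own check for fields the default writer should handle
    if (isStandard(field)) {
      defaultSfw.writeField(info, field);
    } else {
      // TODO: custom logic writing non-standard fields to another IndexOutput
    }
  }

  // remaining StoredFieldsWriter methods omitted

}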
{code}

and similarly for the reader

{code}
class MyCustomStoredFieldsReader extends StoredFieldsReader {

  StoredFieldsReader defaultSfr; // the default Lucene 4.1 stored fields reader

  @Override
  public void visitDocument(int n, StoredFieldVisitor visitor) throws IOException {
    // visit standard fields
    defaultSfr.visitDocument(n, visitor);
    // TODO: then visit the non-standard fields
  }

  // remaining StoredFieldsReader methods omitted

}
{code}
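
To make the delegation idea above concrete without depending on Lucene, here is a minimal, self-contained sketch. All types in it (FieldWriter, FieldVisitor, ListStore, DelegatingWriter, DelegatingReader) are hypothetical stand-ins, not Lucene classes, and it handles a single in-memory document only; it just illustrates routing fields on write and visiting both stores on read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a stored fields writer (NOT the Lucene API).
interface FieldWriter {
    void writeField(String name, String value);
}

// Hypothetical stand-in for a stored field visitor (NOT the Lucene API).
interface FieldVisitor {
    void stringField(String name, String value);
}

// In-memory store for the fields of one document; plays the role of one output file.
class ListStore implements FieldWriter {
    private final List<String[]> fields = new ArrayList<>();

    @Override
    public void writeField(String name, String value) {
        fields.add(new String[] {name, value});
    }

    void visit(FieldVisitor visitor) {
        for (String[] f : fields) {
            visitor.stringField(f[0], f[1]);
        }
    }
}

// Routes standard fields to the default store and everything else to a second store.
class DelegatingWriter implements FieldWriter {
    final ListStore defaultStore = new ListStore(); // stands in for the default writer
    final ListStore customStore = new ListStore();  // stands in for the extra output
    private final Predicate<String> isStandard;

    DelegatingWriter(Predicate<String> isStandard) {
        this.isStandard = isStandard;
    }

    @Override
    public void writeField(String name, String value) {
        if (isStandard.test(name)) {
            defaultStore.writeField(name, value);
        } else {
            customStore.writeField(name, value); // custom encoding would go here
        }
    }
}

// Visits the default store first, then the custom store.
class DelegatingReader {
    private final ListStore defaultStore;
    private final ListStore customStore;

    DelegatingReader(ListStore defaultStore, ListStore customStore) {
        this.defaultStore = defaultStore;
        this.customStore = customStore;
    }

    void visitDocument(FieldVisitor visitor) {
        defaultStore.visit(visitor);
        customStore.visit(visitor);
    }
}
```

Note that in this sketch the visitor sees all standard fields before any custom ones, so the original field order of the document is not preserved across the two stores.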

Would it work for your use-case?
                
> Make StoredFieldsFormat more configurable
> -----------------------------------------
>
>                 Key: LUCENE-4591
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4591
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 4.1
>            Reporter: Renaud Delbru
>             Fix For: 4.1
>
>         Attachments: LUCENE-4591.patch
>
>
> The current StoredFieldsFormat implementations assume that the index uses only one type of StoredFieldsFormat.
> We would like to be able to configure a StoredFieldsFormat per field, similarly to PostingsFormat.
> A few issues need to be solved to allow that:
> 1) allow configuring a segment suffix for the StoredFieldsFormat
> 2) implement the SPI interface in StoredFieldsFormat
> 3) create a PerFieldStoredFieldsFormat
> We propose to start with 1) by modifying the signatures of StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that they take SegmentReadState and SegmentWriteState instead of the current set of parameters.
> Let us know what you think about this idea. If it is of interest, we can contribute a first patch for 1).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

