lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2308) Separately specify a field's type
Date Sat, 18 Jun 2011 13:29:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051512#comment-13051512 ]

Michael McCandless commented on LUCENE-2308:
--------------------------------------------

Patch looks good, thanks Nikola!

When you make the patch, can you run "svn diff" from the top-level
dir?  Ie, so that file paths look like
lucene/src/java/org/apache/lucene/document/Field.java

A couple minor code-formatting things:

  * Please add { } around one-line ifs, eg in FieldType.toString

  * import lines go after the copyright (FieldType.java)

  * If possible, please try to avoid adding "noise" to the patch, for
    example re-formatting javadocs (eg NumericField.java).  It's fine
    to clean things up (add missing {}'s to existing code) as you go,
    but a pure reformat just adds noise, which makes it harder to see
    the real changes.

Other stuff:

  * The DEFAULT_TYPE for each field can be final, right?

  * For FieldType, can we use direct members of the class, instead of
    the EnumSet?  (Ie, boolean indexed, boolean stored, etc.).
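
A rough sketch of what the last two points could look like, assuming a
small immutable type.  The names SketchFieldType and SketchTextField
and the three flags are hypothetical stand-ins for illustration, not
the code in the attached patch:

```java
// Hypothetical sketch only: names and flags are illustrative, not the patch's code.
final class SketchFieldType {
    // Direct boolean members instead of an EnumSet of flags.
    private final boolean indexed;
    private final boolean stored;
    private final boolean tokenized;

    SketchFieldType(boolean indexed, boolean stored, boolean tokenized) {
        this.indexed = indexed;
        this.stored = stored;
        this.tokenized = tokenized;
    }

    boolean indexed() { return indexed; }
    boolean stored() { return stored; }
    boolean tokenized() { return tokenized; }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // Braces around every one-line if, as asked for in the formatting notes.
        if (indexed) {
            append(sb, "indexed");
        }
        if (stored) {
            append(sb, "stored");
        }
        if (tokenized) {
            append(sb, "tokenized");
        }
        return sb.toString();
    }

    // Appends a flag name, comma-separated from any earlier ones.
    private static void append(StringBuilder sb, String flag) {
        if (sb.length() > 0) {
            sb.append(',');
        }
        sb.append(flag);
    }
}

final class SketchTextField {
    // The per-field default can be final; this sketch also makes the type immutable.
    static final SketchFieldType DEFAULT_TYPE = new SketchFieldType(true, false, true);
}

class FieldTypeSketchDemo {
    public static void main(String[] args) {
        System.out.println(SketchTextField.DEFAULT_TYPE);
    }
}
```

Plain booleans keep each flag readable at the call site and avoid the
EnumSet lookup, at the cost of a longer constructor as flags are added.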

The patch causes compilation errors when I run "ant compile-core", but
that's expected, right?

I think our immediate goal here should be to get a compilable patch
with tests passing, ie the "dirt path".  Then we can go back and
iterate.

But, because so many tests rely on the current Document/Field API... I
think in order to stage this we should make a totally new package,
call it document2 for now, and create all these new classes inside
there.  Then, one by one we can cutover tests to use document2/*,
starting with TestDemo.  Eventually, once everything is cutover, we
can remove document and rename document2 to document.


> Separately specify a field's type
> ---------------------------------
>
>                 Key: LUCENE-2308
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2308
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: LUCENE-2308-2.patch, LUCENE-2308.patch, LUCENE-2308.patch
>
>
> This came up from discussions on IRC.  I'm summarizing here...
> Today when you make a Field to add to a document you can set things
> index or not, stored or not, analyzed or not, details like omitTfAP,
> omitNorms, index term vectors (separately controlling
> offsets/positions), etc.
> I think we should factor these out into a new class (FieldType?).
> Then you could re-use this FieldType instance across multiple fields.
> The Field instance would still hold the actual value.
> We could then do per-field analyzers by adding a setAnalyzer on the
> FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
> for per-field codecs (with flex), where we now have
> PerFieldCodecWrapper).
> This would NOT be a schema!  It's just refactoring what we already
> specify today.  EG it's not serialized into the index.
> This has been discussed before, and I know Michael Busch opened a more
> ambitious (I think?) issue.  I think this is a good first baby step.  We could
> consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
> off on that for starters...
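
The reuse described in the issue text above might look roughly like
this.  It is a minimal sketch under assumed names: ReusableFieldType
and ValueField are hypothetical and are not Lucene classes.  One type
instance carries the index-time settings and is shared by several
fields, while each field holds only its own name and value:

```java
// Hypothetical sketch only: these classes are illustrative, not Lucene's API.
final class ReusableFieldType {
    private final boolean indexed;
    private final boolean stored;

    ReusableFieldType(boolean indexed, boolean stored) {
        this.indexed = indexed;
        this.stored = stored;
    }

    boolean indexed() { return indexed; }
    boolean stored() { return stored; }
}

final class ValueField {
    private final String name;
    private final String value;             // the field still holds the actual value
    private final ReusableFieldType type;   // settings are shared, not copied

    ValueField(String name, String value, ReusableFieldType type) {
        this.name = name;
        this.value = value;
        this.type = type;
    }

    String name() { return name; }
    String value() { return value; }
    ReusableFieldType type() { return type; }
}

class FieldTypeReuseDemo {
    public static void main(String[] args) {
        // One type instance, re-used across multiple fields.
        ReusableFieldType storedText = new ReusableFieldType(true, true);
        ValueField title = new ValueField("title", "Lucene in Action", storedText);
        ValueField author = new ValueField("author", "someone", storedText);
        System.out.println(title.type() == author.type());
    }
}
```

Nothing here is written to an index, which matches the point that this
is a refactoring of existing settings rather than a schema.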

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


