lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields
Date Thu, 10 Dec 2009 23:08:18 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789003#action_12789003
] 

Chris A. Mattmann edited comment on SOLR-1131 at 12/10/09 11:08 PM:
--------------------------------------------------------------------

Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

# Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

# There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the
existing patches. The final patch probably shouldn't include those since it makes it difficult
to understand what was actually changed.
# IndexSchema:
#* when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where
an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
# IndexSchema:
#* what happens if subs.isEmpty() == true?
#* maybe log message that says, dyn field definition is up to you?
#* I took the approach in my attached patch to log it.
# Why does getPolyFieldType(String) throw an exception if the field is not a poly field type
-- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that
one?). In the patch I've attached, I took the approach of only including a getPolyFieldType
that returns null rather than throwing an ex (the NoEx version).
# CoordinateFieldType: why process > 1 sub field types and then throw an exception at the
end? I cleaned this up to throw the Exception when it occurs.
# parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms
to georss point then). I guess because you are supporting N-dimensional points, right?
# parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken
that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed
yet. Hopefully I'll get a chance to put that up later tonight.



      was (Author: chrismattmann):
    Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

1. Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

2. There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the
existing patches. The final patch probably shouldn't include those since it makes it difficult
to understand what was actually changed.
3. IndexSchema:
    - when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be
where an exception is thrown or a message is logged? In the patch I'm attaching I took the
log approach.
4. IndexSchema:
    - what happens if subs.isEmpty() == true?
    - maybe log message that says, dyn field definition is up to you?
    - I took the approach in my attached patch to log it.
5. Why does getPolyFieldType(String) throw an exception if the field is not a poly field type
-- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that
one?). In the patch I've attached, I took the approach of only including a getPolyFieldType
that returns null rather than throwing an ex (the NoEx version).
6. CoordinateFieldType: why process > 1 sub field types and then throw an exception at
the end? I cleaned this up to throw the Exception when it occurs.
7. parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms
to georss point then). I guess because you are supporting N-dimensional points, right?
8. parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken
that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed
yet. Hopefully I'll get a chance to put that up later tonight.


  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message