lucene-solr-dev mailing list archives

From "Darren Vengroff" <vengr...@gmail.com>
Subject RE: dynamicField + copyField = dynamicCopyField?
Date Wed, 07 Jun 2006 03:04:45 GMT
Sounds good, I will check it out.

-D

-----Original Message-----
From: Yonik Seeley [mailto:yseeley@gmail.com] 
Sent: Tuesday, June 06, 2006 6:26 PM
To: solr-dev@lucene.apache.org
Subject: Re: dynamicField + copyField = dynamicCopyField?

Darren, I'm not sure how familiar you are with Lucene, but if you are
using dynamicFields (or a lot of indexed fields), check out the
omitNorms attribute.

Even if your indexed field values are sparse (present in only a few
documents), Lucene by default keeps a 1-byte norm for every document
in the index for each indexed field.  With many indexed fields, that
can really add up.

I've considered making omitNorms the default for non-text fields in
the schema, but I'm worried that some people might try index-time
boosts, which won't work without norms.
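
To give a sense of scale, using purely illustrative numbers: 10 million
documents with 100 indexed fields would mean about 10,000,000 x 100 x 1
byte = 1 GB of norm data.  A minimal sketch of turning norms off for a
dynamic field in schema.xml follows; the "*_s" pattern and the "string"
type are assumptions for illustration, not taken from anyone's actual
config.

```xml
<!-- Hypothetical dynamic field: the name pattern and type are assumed.
     omitNorms="true" skips the per-document norm byte for matching fields,
     which also means index-time boosts on these fields have no effect. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"
              omitNorms="true"/>
```

Fields that rely on index-time boosts or length normalization (typically
full-text fields) would presumably leave omitNorms at its default.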

-Yonik

On 6/5/06, Darren Vengroff <vengroff@gmail.com> wrote:
> Dear Solrians,
>
> I've been looking through the IndexSchema and DocumentBuilder code, hoping
> to find something that is essentially a combination of what dynamicField and
> copyField do.  Specifically, I'd like to say that any field that matches a
> given pattern should be copied to a specific multivalued destination field.
> For example, my config might contain something like:
>
>     <!-- All text fields get thrown in here for default indexing. -->
>     <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
>
>     <!-- Search text by default so we get all text fields. -->
>     <defaultSearchField>text</defaultSearchField>
>
>     <!--
>         Any field ending in _t is assumed to be a text field and is thrown
>         into text, despite the fact that we may not have known of its
>         specific existence when we created the config file.
>     -->
>     <dynamicCopyField source="*_t" dest="text" />
>
> The advantage of this approach over pure copyField is that we don't have to
> know the full set of text fields up front when we are writing the config
> file.  If a field name matches both a dynamicField and a dynamicCopyField,
> then both behaviors should probably occur, although config files will not
> typically be written this way.
>
> Has anything like this been discussed or implemented before?
>
> Looking forward to your feedback.
>
> Thanks,
>
> -D
>
>
>


-- 
-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

