lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Using hundreds of dynamic fields
Date Wed, 16 Jul 2014 19:36:31 GMT
I guess I'm just a big fan of simpler and cleaner data models! Especially if 
I were to have to look at somebody's data model and try to make sense out of 
it, such as how to keep all the fields straight for constructing queries.

But atomic update, with its need to read and rewrite all the fields, is a 
concern as well.

-- Jack Krupansky

-----Original Message----- 
From: Andy Crossen
Sent: Wednesday, July 16, 2014 1:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Using hundreds of dynamic fields

Thanks, Jack and Jared, for your input on this.  I'm looking into whether
parent-child relationships via block or query time join will meet my
requirements.

Jack, I noticed in a bunch of other posts around the web that you've
suggested using dynamic fields in moderation.  Is this suggestion based on
negative performance implications of having to read and rewrite all
previous fields for a document when doing atomic updates?  Or are there
additional inherent negatives to using lots of dynamic fields?

Andy


On Fri, Jun 27, 2014 at 11:46 AM, Jared Whiklo <Jared.Whiklo@umanitoba.ca>
wrote:

> This is probably not the best answer, but my gut says that even if you
> changed your document to just two fields, one holding your metric and
> the other a TrieDateField, you would speed up and simplify your date
> range queries.
>
>
> --
> Jared Whiklo
>
>
>
> On 2014-06-27 10:10 AM, "Andy Crossen" <acrossen@gmail.com> wrote:
>
> >Hi folks,
> >
> >My application requires tracking a daily performance metric for all
> >documents. I start tracking for an 18 month window from the time a doc is
> >indexed, so each doc will have ~548 of these fields.  I have in my schema
> >a dynamic field to capture this requirement:
> >
> ><dynamicField name="metric_*" type="int" …/>
> >
> >Example:
> >metric_2014_06_24 : 15
> >metric_2014_06_25 : 21
> >…
> >
> >My application then issues a query that:
> >a) sorts documents by the sum of the metrics within a date range that is
> >variable for each query;
> >b) gathers stats on the metrics using the Statistics component.
> >
> >With this design, the app must unfortunately:
> >a) construct the sort as a long list of fields within the spec’d date
> >range to accomplish the sum; e.g.
> >sort=sum(metric_2014_06_24,metric_2014_06_25…) desc
> >b) specify each field in the range independently to the Stats component;
> >e.g. stats.field=metric_2014_06_24&stats.field=metric_2014_06_25…
> >
> >Am I missing a cleaner way to accomplish this given the requirements above?
> >
> >Thanks for any suggestions you may have.
>
> 

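For readers following the thread: the long sort and stats.field lists described
in the original question do not have to be written by hand. The sketch below is
only an illustration of how a client might generate them for a variable date
range. The helper names (metric_fields, build_params), the q=*:* match-all
query, and the choice of Python are assumptions, not anything from the thread;
the field naming follows the metric_YYYY_MM_DD pattern shown in the question.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def metric_fields(start, end, prefix="metric_"):
    """Return one dynamic-field name per day in [start, end], inclusive.

    The prefix and the YYYY_MM_DD suffix mirror the example in the thread.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days + 1
    return [
        prefix + (start + timedelta(days=i)).strftime("%Y_%m_%d")
        for i in range(days)
    ]


def build_params(start, end):
    """Build (name, value) request parameters for the sort and stats calls.

    Hypothetical helper: q=*:* is an assumed match-all query.
    """
    fields = metric_fields(start, end)
    # A single day needs no sum(); sort on the field directly.
    if len(fields) == 1:
        sort = fields[0] + " desc"
    else:
        sort = "sum(" + ",".join(fields) + ") desc"
    params = [("q", "*:*"), ("sort", sort), ("stats", "true")]
    # stats.field is repeated once per field, as in the original question.
    params.extend(("stats.field", name) for name in fields)
    return params


params = build_params(date(2014, 6, 24), date(2014, 6, 26))
query_string = urlencode(params)  # ready to append to a select URL
```

This only automates the string construction; it does not shorten the request
itself, so an 18 month range would still send roughly 548 field names in both
the sort and the stats parameters.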
