lucene-solr-user mailing list archives

From Rick Leir <>
Subject Re: 1 main collection or multiple smaller collections?
Date Thu, 27 Apr 2017 17:01:15 GMT
Does it make sense to use nested documents here? Products could be nested in a supplier document.

Alternately, consider de-normalizing "til it hurts". A product doc might be able to contain
supplier info.
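
To make the de-normalizing idea concrete, here is a minimal sketch in Python. The field names (supplier_id, supplier_name, supplier_country) and the sample data are assumptions for illustration only, not Derek's schema. It also shows the cost he asked about: a change to one supplier means every product doc of that supplier has to be re-indexed.

```python
def denormalize(products, suppliers):
    """Copy each supplier's fields into its product documents."""
    by_id = {s["supplier_id"]: s for s in suppliers}
    flat_docs = []
    for product in products:
        supplier = by_id[product["supplier_id"]]
        doc = dict(product)  # copy so the source rows stay untouched
        doc["supplier_name"] = supplier["name"]
        doc["supplier_country"] = supplier["country"]
        flat_docs.append(doc)
    return flat_docs


def products_to_reindex(products, supplier_id):
    """IDs of product docs that must be re-sent when one supplier changes."""
    return [p["id"] for p in products if p["supplier_id"] == supplier_id]


# Hypothetical sample rows standing in for the database tables.
suppliers = [{"supplier_id": "s1", "name": "Acme", "country": "SG"}]
products = [
    {"id": "p1", "supplier_id": "s1", "title": "Widget"},
    {"id": "p2", "supplier_id": "s1", "title": "Gadget"},
]

flat_docs = denormalize(products, suppliers)
```

Whether that re-indexing cost is acceptable depends on how often supplier info changes and how many products each supplier has.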

On April 27, 2017 8:50:59 AM EDT, Shawn Heisey <> wrote:
>On 4/26/2017 11:57 PM, Derek Poh wrote:
>> There are some common fields between them.
>> At the source data end (database), the supplier info and product info
>> are updated separately. In that case, should I separate them?
>> If they are in a single collection, then when there are updates to only
>> the supplier info, the product info will be indexed again even though
>> there are no updates to it. Is my reasoning valid?
>> On 4/27/2017 1:33 PM, Walter Underwood wrote:
>>> Do they have the same fields or different fields? Are they updated
>>> separately or together?
>>> If they have the same fields and are updated together, I’d put them
>>> in the same collection. Otherwise, probably separate. 
>Walter's statements are right on the money, you just might need a
>little more detail.
>There are two critical details that decide whether you even CAN
>combine different data in a single index: One is that all types of
>records must use the same field (the uniqueKey field) to determine
>uniqueness, and the value of this field must be unique across the
>dataset.  The other is that there SHOULD be a field with a name like
>"type" that your search client can use to differentiate the different
>kinds of documents.  This type field is not necessary, but it does make
>things easier.
>Assuming you CAN combine documents, there is still the question of
>whether you SHOULD.  If the fields that you will commonly search are
>the same between the different kinds of documents, and if people want to be
>able to do one search and get more than one of the document types you
>are indexing, then it is something you should consider.  If people will
>only ever search one type of document, you should probably keep them in
>separate indexes to keep things cleaner.
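
Shawn's two requirements (one uniqueKey value that is unique across all record types, plus a "type" field) could be sketched roughly like this in Python. The field names "id" and "type" and the type-prefix scheme for the key are assumptions; the real uniqueKey field is whatever the schema defines. The q and fq request parameters are standard Solr.

```python
from urllib.parse import urlencode


def to_solr_doc(doc_type, source_id, fields):
    """Build a doc whose key is unique across types by prefixing the type."""
    doc = {"id": f"{doc_type}-{source_id}", "type": doc_type}
    doc.update(fields)
    return doc


def select_params(query, doc_type=None):
    """Query string for a search, optionally restricted to one doc type."""
    params = [("q", query)]
    if doc_type is not None:
        # A filter query narrows results to a single kind of document.
        params.append(("fq", f"type:{doc_type}"))
    return urlencode(params)


# A supplier and a product that share the same database id (hypothetical).
combined = [
    to_solr_doc("supplier", 42, {"name": "Acme"}),
    to_solr_doc("product", 42, {"name": "Acme widget"}),
]
combined_ids = [d["id"] for d in combined]
```

Leaving doc_type out searches both kinds of documents at once, which is the case where a single collection pays off.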

Sorry for being brief. Alternate email is rickleir at yahoo dot com 