lucene-solr-user mailing list archives

From Derek Poh <d...@globalsources.com>
Subject Re: 1 main collection or multiple smaller collections?
Date Fri, 28 Apr 2017 02:25:33 GMT
Richard

I am considering the same option as your suggestion: putting them in 1 single
collection of product documents, with each product doc containing the supplier info.
With this option, a supplier's info gets repeated in each of that
supplier's product docs. I may be influenced by DB concepts. I guess that is the
trade-off for this option.
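
To make the trade-off concrete, here is a minimal sketch of the de-normalized option. All field names (supplier_id, supplier_name, supplier_country) and the sample rows are hypothetical, not our real schema. It copies the supplier info into every product doc, and shows which product docs would have to be sent for indexing again when only one supplier's info changes.

```python
# Hypothetical source rows; the field names are assumptions for illustration.
suppliers = [
    {"supplier_id": "S1", "name": "Acme Ltd", "country": "SG"},
    {"supplier_id": "S2", "name": "Bolt Co", "country": "HK"},
]
products = [
    {"id": "P1", "supplier_id": "S1", "title": "LED lamp"},
    {"id": "P2", "supplier_id": "S1", "title": "USB cable"},
    {"id": "P3", "supplier_id": "S2", "title": "Desk fan"},
]


def denormalize(products, suppliers):
    """Return one flat doc per product, with its supplier's info copied in."""
    supplier_by_id = {s["supplier_id"]: s for s in suppliers}
    docs = []
    for product in products:
        supplier = supplier_by_id[product["supplier_id"]]
        doc = dict(product)  # copy, so the source row is left untouched
        doc["supplier_name"] = supplier["name"]
        doc["supplier_country"] = supplier["country"]
        docs.append(doc)
    return docs


def products_to_reindex(products, supplier_id):
    """Ids of the product docs that must be re-sent when one supplier changes."""
    return [p["id"] for p in products if p["supplier_id"] == supplier_id]


docs = denormalize(products, suppliers)
stale_ids = products_to_reindex(products, "S1")
```

In this sketch a change to supplier S1 touches both of its product docs, which is the repetition cost mentioned above.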

On 4/28/2017 1:01 AM, Rick Leir wrote:
> Does it make sense to use nested documents here? Products could be nested in a supplier
> document perhaps.
>
> Alternately, consider de-normalizing "til it hurts". A product doc might be able to contain
> supplier info.
>
> On April 27, 2017 8:50:59 AM EDT, Shawn Heisey <apache@elyograg.org> wrote:
>> On 4/26/2017 11:57 PM, Derek Poh wrote:
>>> There are some common fields between them.
>>> At the source data end (database), the supplier info and product info
>>> are updated separately. In this regard, should I separate them?
>>> If it's in 1 single collection, when there are updates to only the
>>> supplier info, the product info will be indexed again even though there
>>> are no updates to it. Is my reasoning valid?
>>>
>>>
>>> On 4/27/2017 1:33 PM, Walter Underwood wrote:
>>>> Do they have the same fields or different fields? Are they updated
>>>> separately or together?
>>>>
>>>> If they have the same fields and are updated together, I’d put them
>>>> in the same collection. Otherwise, probably separate.
>> Walter's statements are right on the money, you just might need a little
>> more detail.
>>
>> There are two critical details that decide whether you even CAN
>> combine different data in a single index: One is that all types of
>> records must use the same field (the uniqueKey field) to determine
>> uniqueness, and the value of this field must be unique across the entire
>> dataset.  The other is that there SHOULD be a field with a name like
>> "type" that your search client can use to differentiate the different
>> kinds of documents.  This type field is not necessary, but it does make
>> things easier.
>>
>> Assuming you CAN combine documents, there is still the question of
>> whether you SHOULD.  If the fields that you will commonly search are the
>> same between the different kinds of documents, and if people want to be
>> able to do one search and get more than one of the document types you
>> are indexing, then it is something you should consider.  If people will
>> only ever search one type of document, you should probably keep them in
>> separate indexes to keep things cleaner.
>>
>> Thanks,
>> Shawn
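
The two conditions Shawn describes for a combined index can be sketched roughly as below. This is only an illustration under assumptions: the "id" and "type" field names, the idea of prefixing the database key with the document type to keep ids unique, and the sample values are all hypothetical. The only Solr-specific detail used is the standard fq (filter query) request parameter.

```python
def make_doc(doc_type, source_key, fields):
    """Build a doc whose id is unique across all document types.

    Assumption: product and supplier keys come from separate database
    tables and could collide, so the type is prefixed to the key.
    """
    doc = dict(fields)
    doc["id"] = f"{doc_type}-{source_key}"  # value for the uniqueKey field
    doc["type"] = doc_type                  # lets the client tell kinds apart
    return doc


def check_unique_ids(docs):
    """Raise ValueError if two docs share the same id."""
    seen = set()
    for doc in docs:
        if doc["id"] in seen:
            raise ValueError(f"duplicate id: {doc['id']}")
        seen.add(doc["id"])


def search_params(query, doc_type=None):
    """Request parameters; fq restricts results to a single document type."""
    params = {"q": query}
    if doc_type is not None:
        params["fq"] = f"type:{doc_type}"
    return params


combined = [
    make_doc("product", 1001, {"name": "LED lamp"}),
    make_doc("supplier", 1001, {"name": "Acme Ltd"}),  # same key, other table
]
check_unique_ids(combined)
product_only = search_params("lamp", doc_type="product")
```

Leaving doc_type out of search_params gives the "one search, more than one document type" case; passing it gives the per-type search.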

