lucene-solr-user mailing list archives

From Chris Vizisa <chris.viz...@gmail.com>
Subject Re: NRT updates
Date Tue, 07 Jun 2016 01:24:39 GMT
Hi,
Any pointers, suggestions, or experiences, please?

Thanks!
Chris.

On Mon, Jun 6, 2016 at 10:27 AM, Chris Vizisa <chris.vizisa@gmail.com>
wrote:

> Hi,
>
> Does the number of fields in a document affect NRT updates?
> I have around 1.6 million products. Each product can be available in about
> 3000 stores.
> In addition to around 50 fields related to a product, I am storing
> product_store info in each product document, like:
>  1. Quantity of that product in each store (store_n1_count,
> store_n2_count, ..., store_3000_count)
>  2. Status of that product in each store (store_n1_status,
> store_n2_status, ..., store_3000_status)
>
> I would need to do NRT updates on the count and status of each product,
> across all of the roughly 1.6 million products.
>
> Q1. Is it okay to do NRT updates on this product collection (for each
> product's store_count and store_status) at around 900 updates per second
> across the different products? Please note that each product's status
> as well as its count gets updated, and there are 1.6M products.
> Q2. Is it okay to use atomic updates for the NRT updates of multiple
> store_counts and multiple store_statuses of each product, across around
> 1.6 million products in total? Or is there a better way to handle this
> amount of dynamic data change?
> For atomic updates, I understand that all fields need to be stored.
> Q3. So basically, can I keep all this info in the product collection
> itself, or should I store the store_status info separately, joined on
> productId, for the NRT scenario to work best? In that case each
> product_store record is a separate document with only 3 or 4 fields, but
> there are many millions of documents (worst case 1.6M products multiplied
> by 3000 stores).
> Q4. When we embed all store-related info in the product doc itself, a
> single product doc can be a candidate for simultaneous updates, as its
> count or status can change in different stores at the same time. If we
> go for a separate collection holding product_status info, mostly only
> one doc is updated at a time. Which is more efficient?
>
>
> Could someone please suggest what is optimal? Any pointers welcome.
>
> Thanks!
> Chris.
>
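
To make the two layouts in Q2 and Q3 concrete, here is a rough Python sketch of what the update payloads might look like, plus the sizes implied by the numbers in the post. It only builds JSON and sends nothing to Solr. The `{"set": ...}` modifier is Solr's atomic-update syntax and the `store_n1_count` / `store_n1_status` names follow the post; the `product_id`, `store`, `count`, `status` fields, the composite `id`, and the `"in_stock"` value are assumptions made up for illustration.

```python
import json

NUM_PRODUCTS = 1_600_000  # "around 1.6 million products" (from the post)
NUM_STORES = 3_000        # "about 3000 stores" (from the post)


def embedded_atomic_update(product_id, store, count, status):
    """Option A: atomic update of two per-store fields on one product doc."""
    return {
        "id": product_id,
        f"store_{store}_count": {"set": count},
        f"store_{store}_status": {"set": status},
    }


def separate_store_doc(product_id, store, count, status):
    """Option B: one small document per (product, store) pair, fully replaced."""
    return {
        "id": f"{product_id}_{store}",  # hypothetical composite key
        "product_id": product_id,       # hypothetical join field
        "store": store,
        "count": count,
        "status": status,
    }


# Sizes implied by the post's numbers.
per_store_fields_per_product = 2 * NUM_STORES     # count + status per store
worst_case_store_docs = NUM_PRODUCTS * NUM_STORES  # option B, worst case

if __name__ == "__main__":
    # Solr's JSON update format accepts a list of documents like these.
    print(json.dumps([embedded_atomic_update("P1", "n1", 7, "in_stock")]))
    print(json.dumps([separate_store_doc("P1", "n1", 7, "in_stock")]))
    print(per_store_fields_per_product, worst_case_store_docs)
```

With option A, every change rewrites a product document carrying roughly 6000 per-store fields on top of the 50 product fields; with option B, each change rewrites a tiny document, but the collection can grow to 4.8 billion documents in the worst case.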
