lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Atomic Updates : Performance Impact
Date Fri, 23 Feb 2018 19:17:53 GMT
bq: However, if I don't have the majority of the other column data while doing
update operations, is it better to go with atomic update?

I don't understand what you're asking. To use Atomic Updates, _every_
original field (i.e. any field that is _not_ the destination of a
copyField directive) must be stored. That's just a basic requirement.
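
To make the stored-field requirement concrete, here is a minimal sketch of a pre-flight check. It assumes the schema has already been loaded into a plain list of dicts; the function name, field names, and schema layout are all hypothetical and are not part of any Solr API. It only encodes the rule as stated above: every field that is not a copyField destination must be stored.

```python
def fields_blocking_atomic_updates(fields, copy_field_dests):
    """Return names of fields that break the rule stated above:
    not a copyField destination, yet not stored.

    `fields` is a hypothetical in-memory view of the schema: a list of
    dicts with a "name" key and an optional boolean "stored" key.
    """
    dests = set(copy_field_dests)
    blocking = []
    for field in fields:
        name = field["name"]
        if name in dests:
            # copyField destinations are rebuilt from their sources,
            # so the rule does not require them to be stored.
            continue
        if not field.get("stored", False):
            blocking.append(name)
    return blocking


# Hypothetical schema: "body" is an original field that is not stored.
schema_fields = [
    {"name": "id", "stored": True},
    {"name": "title", "stored": True},
    {"name": "body", "stored": False},
    {"name": "text_all", "stored": False},  # copyField destination
]
copy_dests = ["text_all"]

print(fields_blocking_atomic_updates(schema_fields, copy_dests))
```

If the returned list is non-empty, an atomic update would rebuild the document without those fields' original values, which is why the requirement exists.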

bq: And also during the update process, if there is a simultaneous search
request on the collection, will there be any lag in response?

This is just like any other update: the changes will be visible after
the next soft commit or hard commit with openSearcher=true.
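
As an illustration of how an update and its visibility fit together, the sketch below only builds the URL and JSON body for a single-field atomic update and does not send anything. The host, the collection name "products", the field "price", and the unique key "id" are assumptions made for the example; the "set" modifier and the commitWithin request parameter are standard Solr features, but check the behaviour against your own version and configuration.

```python
import json
from urllib.parse import quote, urlencode


def build_atomic_update(base_url, collection, doc_id, field, value,
                        commit_within_ms=None):
    """Build (url, body) for an atomic update that sets a single field.

    Nothing is sent over the network; this only shows the request shape.
    """
    url = f"{base_url.rstrip('/')}/{quote(collection)}/update"
    if commit_within_ms is not None:
        # Ask Solr to commit within this many milliseconds. Until a commit
        # opens a new searcher, searches keep seeing the old document.
        url += "?" + urlencode({"commitWithin": commit_within_ms})
    # "id" as the unique key field is an assumption for this example.
    body = json.dumps([{"id": doc_id, field: {"set": value}}])
    return url, body


# Hypothetical host, collection, document id, and field.
url, body = build_atomic_update(
    "http://localhost:8983/solr", "products", "doc-1", "price", 99,
    commit_within_ms=10000,
)
print(url)
print(body)
```

The body would be POSTed with Content-Type: application/json. Searches running while the update is in flight are served from the currently open searcher, so they see the old value until the commit happens.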

Best,
Erick

On Fri, Feb 23, 2018 at 9:39 AM, Uday Jami <udayjami@gmail.com> wrote:
> Hello Erick,
>
> Thanks for the explanation.
> However, if I don't have the majority of the other column data while doing
> update operations, is it better to go with atomic update?
>
> And also during the update process, if there is a simultaneous search
> request on the collection, will there be any lag in response?
>
>
> Thanks,
> Uday
>
> On Fri, Feb 23, 2018 at 10:47 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> The approximate amount of work will be very close to what it would
>> take Solr to just index the documents from a client. Actually it puts
>> a little _more_ of a load on Solr. In the case you do an Atomic
>> Update, Solr has to
>> 1> fetch all the stored fields from the index
>> 2> construct a Solr document
>> 3> change the values in the doc based on the atomic update
>> 4> re-index the doc just as though it had received it from a client.
>>
>> Whereas if you just send the doc from an external client Solr has to
>> 1> de-serialize the doc
>> 2> index it (identical to step 4 above)
>>
>> The sweet spot for Atomic Updates is when you can't easily get the
>> original document from the system-of-record.
>>
>> Best,
>> Erick
>>
>> On Fri, Feb 23, 2018 at 9:02 AM, Uday Jami <udayjami@gmail.com> wrote:
>> > Can you please let me know what the performance impact will be of trying
>> > to update 120 million records in a collection containing 1 billion records?
>> > The collection contains around 30 columns, and only one of them is
>> > updated as part of the atomic update.
>> > It's not a batch update; the 120 million updates will happen within 24
>> > hours.
>> >
>> > How is search on the above collection going to be impacted during the
>> > above update process?
>> >
>> > Thanks,
>> > Uday
>>
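
The two code paths compared in the thread (four steps for an Atomic Update, two for a full document sent by a client) can be sketched with a toy in-memory model. This is not how Solr is implemented: the dict standing in for the index, the function names, and the restriction to the "set" modifier are all simplifying assumptions made for illustration.

```python
def apply_atomic_update(stored_index, update, id_field="id"):
    """Toy model of the four Atomic Update steps described in the thread."""
    doc_id = update[id_field]
    # 1> fetch all the stored fields from the (toy) index
    stored = stored_index[doc_id]
    # 2> construct a document from them
    doc = dict(stored)
    # 3> change the values based on the atomic update (only "set" is modeled)
    for field, change in update.items():
        if field == id_field:
            continue
        if isinstance(change, dict) and "set" in change:
            doc[field] = change["set"]
        else:
            raise ValueError(f"unsupported modifier for field {field!r}")
    # 4> re-index the document as though a client had sent it
    return index_full_document(stored_index, doc, id_field)


def index_full_document(stored_index, doc, id_field="id"):
    """Toy model of the client-sent path: the whole document is replaced."""
    stored_index[doc[id_field]] = dict(doc)
    return stored_index[doc[id_field]]


# Hypothetical document with three fields; only "price" is updated.
index = {"1": {"id": "1", "title": "widget", "price": 10}}
result = apply_atomic_update(index, {"id": "1", "price": {"set": 12}})
print(result)
```

Both paths end in the same full re-index of the document; the atomic path adds the fetch-and-rebuild work in front of it, which is the "little more load" described above. For scale, 120 million updates spread evenly over 24 hours (86,400 seconds) average roughly 1,389 updates per second.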
