lucene-solr-user mailing list archives

From Emir Arnautovic <emir.arnauto...@sematext.com>
Subject Re: Commit after every document - alternate approach
Date Wed, 02 Mar 2016 10:11:24 GMT
Hi Sangeetha,
One thing is certain: this is not going to work. At 200-300K docs/hour 
you would be issuing more than 50 commits/second, which leaves less than 
20 ms for each doc+commit.
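
To make the arithmetic explicit, here is a small sketch (the class and method names are made up for illustration). It converts the hourly rate into docs/second and the time budget per doc+commit, which works out to 18 ms at 200K docs/hour and 12 ms at 300K docs/hour:

```java
public class CommitBudget {
    // Documents per second for a given hourly rate.
    static double docsPerSecond(long docsPerHour) {
        return docsPerHour / 3600.0;
    }

    // Milliseconds available per document (index + commit) if every
    // document is committed on its own and processed sequentially.
    static double millisPerDoc(long docsPerHour) {
        return 3_600_000.0 / docsPerHour;
    }

    public static void main(String[] args) {
        for (long rate : new long[] {200_000L, 300_000L}) {
            System.out.println(rate + " docs/hour: "
                    + docsPerSecond(rate) + " docs/s, "
                    + millisPerDoc(rate) + " ms per doc+commit");
        }
    }
}
```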
What you can do is let Solr handle commits and maybe use real-time get to 
verify that a doc is in Solr, or run some periodic sanity checks.
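
A rough sketch of that idea, under assumptions: the `Index` interface below is a hypothetical stand-in for the Solr client, not a SolrJ type, and `FakeIndex` exists only so the example runs without a Solr instance. In real client code the two methods would map onto an update request without an explicit commit and a real-time get lookup by id. The point is only the flow: send a batch, look each id up, and keep the unconfirmed ids in MQ for a retry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchVerifier {
    /** Hypothetical stand-in for the Solr client (an assumption, not a SolrJ API). */
    interface Index {
        void addBatch(Map<String, String> docsById); // send docs, no explicit commit
        boolean realTimeGet(String id);              // true if the doc can be fetched by id
    }

    /** Sends one batch and returns the ids that could not be confirmed. */
    static List<String> indexAndVerify(Index index, Map<String, String> docsById) {
        index.addBatch(docsById);
        List<String> missing = new ArrayList<>();
        for (String id : docsById.keySet()) {
            if (!index.realTimeGet(id)) {
                missing.add(id); // leave this message in MQ and retry later
            }
        }
        return missing;
    }

    /** In-memory fake that silently drops the given ids, to simulate lost docs. */
    static class FakeIndex implements Index {
        private final Set<String> stored = new HashSet<>();
        private final Set<String> dropped;

        FakeIndex(Set<String> dropped) {
            this.dropped = dropped;
        }

        @Override
        public void addBatch(Map<String, String> docsById) {
            for (String id : docsById.keySet()) {
                if (!dropped.contains(id)) {
                    stored.add(id);
                }
            }
        }

        @Override
        public boolean realTimeGet(String id) {
            return stored.contains(id);
        }
    }

    public static void main(String[] args) {
        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("a", "first doc");
        batch.put("b", "second doc");
        batch.put("c", "third doc");

        Set<String> dropped = new HashSet<>();
        dropped.add("b");

        List<String> missing = indexAndVerify(new FakeIndex(dropped), batch);
        System.out.println("Unconfirmed ids: " + missing);
    }
}
```

Only the ids returned by `indexAndVerify` would stay in the queue; everything else can be acknowledged without waiting for a hard commit per document.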
Are you doing document updates, and is the need to apply those updates in 
order the reason why you commit each doc before moving on to the next one?

Regards,
Emir

On 02.03.2016 09:06, sangeetha.subramanian@gtnexus.com wrote:
> Hi All,
>
> I am trying to understand how we can have commits issued to Solr while
> indexing documents. Around 200K to 300K documents per hour, with an average
> size of 10 KB each, will be getting into SOLR. JAVA code fetches the
> documents from MQ and streams them to SOLR. The problem is that the client
> code issues a hard commit after each document that is sent to SOLR for
> indexing, and it waits for the response from SOLR to get assurance that the
> document was indexed successfully. Only if it gets an OK status from SOLR
> is the document cleared out from SOLR.
>
> As far as I understand, doing a commit after each document is an expensive
> operation. But we need to make sure that all the documents which are put
> into MQ get indexed in SOLR. Is there any other way of getting this done?
> Please let me know.
> If we do batch indexing, is there any chance we can identify whether some
> documents were missed from indexing?
>
> Thanks
> Sangeetha
>

-- 
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

