lucene-solr-user mailing list archives

From Michael Della Bitta <michael.della.bi...@appinions.com>
Subject Re: Details on why ConcurrentUpdateSolrServer is recommended for maximum index performance
Date Thu, 11 Dec 2014 16:19:27 GMT
Tom:

ConcurrentUpdateSolrServer isn't magic. You could fairly easily write
something that takes batches of your XML documents, combines them into a
single update message (multiple <doc> elements inside one <add>
element), and sends that to Solr, and you would get some of the same
speed benefits.
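
A minimal sketch of the batching idea above, in plain Java with no SolrJ
dependency. The class AddBatcher and its method are hypothetical names,
not part of any Solr API. It assumes each input string is already a
well-formed <doc>...</doc> fragment with no XML declaration, and it
leaves the HTTP POST to the update handler out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of SolrJ): groups <doc> fragments into <add> messages.
final class AddBatcher {

    /** Wraps each run of up to batchSize <doc> fragments in a single <add> element. */
    static List<String> toAddMessages(List<String> docFragments, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> messages = new ArrayList<>();
        for (int start = 0; start < docFragments.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docFragments.size());
            StringBuilder message = new StringBuilder("<add>");
            // Assumption: every fragment is a complete, well-formed <doc> element.
            for (String doc : docFragments.subList(start, end)) {
                message.append(doc);
            }
            messages.add(message.append("</add>").toString());
        }
        return messages;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "<doc><field name=\"id\">1</field></doc>",
                "<doc><field name=\"id\">2</field></doc>",
                "<doc><field name=\"id\">3</field></doc>");
        // Each printed line is one request body to send to Solr.
        for (String message : toAddMessages(docs, 2)) {
            System.out.println(message);
        }
    }
}
```

Each returned string would be sent as one update request, so the number
of round trips drops by roughly the batch size.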

If you do use CUSS, though, its JavaBin-based serialization is lighter
on the wire than XML:
http://lucene.apache.org/solr/4_10_2/solr-solrj/org/apache/solr/client/solrj/impl/BinaryRequestWriter.html

The only thing you have to worry about (in both the CUSS and the
home-grown case) is that a single bad document in a batch fails the
whole batch. It's up to you to fall back to writing the documents
individually so the rest of the batch makes it in.
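
The fallback described above could look roughly like this. BatchFallback
and its Sender interface are hypothetical stand-ins for whatever actually
talks to Solr. The sketch assumes the sender is synchronous and throws
when Solr rejects a request, and that resending a document that already
made it in is harmless because an add with the same unique key
overwrites the earlier one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of SolrJ): batch first, then one document at a time.
final class BatchFallback {

    /** Hypothetical transport; assumed to throw if Solr rejects the request. */
    interface Sender<D> {
        void send(List<D> docs) throws Exception;
    }

    /** Sends the whole batch; if that fails, resends each document alone.
     *  Returns the documents that failed even when sent individually. */
    static <D> List<D> sendWithFallback(List<D> batch, Sender<D> sender) {
        List<D> failed = new ArrayList<>();
        try {
            sender.send(batch);
            return failed; // whole batch accepted, nothing failed
        } catch (Exception batchError) {
            // At least one document is bad; fall through and isolate it.
        }
        for (D doc : batch) {
            try {
                sender.send(Collections.singletonList(doc));
            } catch (Exception docError) {
                failed.add(doc); // caller can log or retry these later
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Toy sender that rejects any request containing the string "bad".
        Sender<String> sender = docs -> {
            if (docs.contains("bad")) {
                throw new IllegalStateException("rejected");
            }
        };
        List<String> failed = sendWithFallback(Arrays.asList("a", "bad", "c"), sender);
        System.out.println("Failed documents: " + failed);
    }
}
```

The cost of a bad document is one wasted batch request plus one request
per document in that batch, so smaller batches recover faster but give
up some of the throughput gain.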

Michael

On 12/11/14 11:04, Erick Erickson wrote:
> I don't think so, it uses SolrInputDocuments and
> lists thereof. So if you parse the xml and then
> put things in SolrInputDocuments......
>
> Or something like that.
>
> Erick
>
> On Thu, Dec 11, 2014 at 9:43 AM, Tom Burton-West <tburtonw@umich.edu> wrote:
>> Thanks Erick,
>>
>> That is helpful.  We already have a process that works similarly.  Each
>> thread/process that sends a document to Solr waits until it gets a response
>> in order to make sure that the document was indexed successfully (we log
>> errors and retry docs that don't get indexed successfully).  However, we run
>> 20-100 of these processes, depending on throughput (i.e. we send documents
>> to Solr for indexing as fast as we can until they start queuing up on the
>> Solr end).
>>
>> Is there a way to use CUSS with XML documents?
>>
>> i.e. my second question:
>>> A related question is how to use ConcurrentUpdateSolrServer with XML
>>> documents.
>>>
>>> I have very large XML documents, and the examples I see all build documents
>>> by adding fields in Java code.  Is there an example that actually reads XML
>>> files from the file system?
>> Tom

