lucene-solr-user mailing list archives

From Erick Erickson
Subject Re: What is the best approach to send lots of XML Messages to Solr to build index?
Date Sun, 15 Jun 2014 15:59:53 GMT
A couple of things:

> Consider indexing them with SolrJ, here's a place to get started:
Especially if you use a SAX-based parser, you have more control over memory consumption; it's
on the client, after all. And you can rack together as many clients, all going to Solr, as you need.
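To make the SAX point concrete, here is a minimal sketch of a client-side parser that streams a file and hands documents off in fixed-size batches, so only one batch is ever held in memory. It assumes the files use Solr's standard `<add><doc><field name="...">` update format, which the thread does not show. The class name `StreamingXmlBatcher` and the `sink` callback are hypothetical; in a real client the sink is where a SolrJ call to add the batch would go. For brevity a repeated field name overwrites the earlier value, so multi-valued fields would need a list instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: streams <doc> elements and emits them in batches. */
public class StreamingXmlBatcher extends DefaultHandler {
    private final int batchSize;
    private final Consumer<List<Map<String, String>>> sink;
    private List<Map<String, String>> batch = new ArrayList<>();
    private Map<String, String> doc;      // document currently being read
    private String fieldName;             // field currently being read
    private final StringBuilder text = new StringBuilder();

    public StreamingXmlBatcher(int batchSize, Consumer<List<Map<String, String>>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("doc".equals(qName)) {
            doc = new LinkedHashMap<>();
        } else if ("field".equals(qName) && doc != null) {
            fieldName = attrs.getValue("name");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so accumulate.
        if (fieldName != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("field".equals(qName) && fieldName != null) {
            doc.put(fieldName, text.toString());
            fieldName = null;
        } else if ("doc".equals(qName) && doc != null) {
            batch.add(doc);
            doc = null;
            if (batch.size() >= batchSize) {
                flush();
            }
        }
    }

    @Override
    public void endDocument() {
        flush(); // send whatever is left at end of file
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        sink.accept(batch);          // assumption: a SolrJ add call would go here
        batch = new ArrayList<>();   // start a fresh batch so memory stays bounded
    }

    public static void parse(InputStream in, int batchSize,
                             Consumer<List<Map<String, String>>> sink) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(in, new StreamingXmlBatcher(batchSize, sink));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add><doc><field name=\"id\">1</field></doc>"
                + "<doc><field name=\"id\">2</field></doc>"
                + "<doc><field name=\"id\">3</field></doc></add>";
        parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), 2,
                batch -> System.out.println("would send " + batch.size() + " docs"));
    }
}
```

Running several such clients in parallel, each on its own set of files, is one way to scale the indexing side without touching Solr's heap.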

> Here's a bunch of information about tlogs and commits that might be useful background.
Consider setting your <autoCommit> interval quite short (15 seconds)
with openSearcher set to false. That'll truncate your tlog, although
how that relates to your error is something of a mystery to me...
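
For reference, a short hard-commit interval with openSearcher off might look like the fragment below. It goes inside the <updateHandler> section of solrconfig.xml; 15000 ms corresponds to the 15 seconds mentioned above, and the exact value is only a starting point to tune. A hard commit like this flushes to disk and rolls the tlog over without opening a new searcher.

```xml
<autoCommit>
  <!-- hard commit every 15 seconds (value is in milliseconds) -->
  <maxTime>15000</maxTime>
  <!-- do not open a new searcher on these commits -->
  <openSearcher>false</openSearcher>
</autoCommit>
```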


On Sun, Jun 15, 2014 at 3:14 AM, Mikhail Khludnev wrote:
> Hello Floyd,
> Did you consider disabling the tlog?
> Does a file consist of many docs?
> Do you have SolrCloud? Do you use just sh/curl, or do you have a Java program?
> DIH is not really performant so far. Submitting roughly ten huge files in
> parallel is one way to get good performance. Once again, nuke the tlog.
> On Sun, Jun 15, 2014 at 12:44 PM, Floyd Wu wrote:
>> Hi,
>> I have many XML message files formatted like this
>> These files are generated by my index builder daily.
>> Currently I am sending these files to Solr via HTTP POST, but sometimes I
>> hit an OOM exception or end up with too many pending tlogs.
>> Is there a better way to "import" these files into Solr to build the index?
>> Thanks for the suggestion
>> Floyd
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
