Date: Sun, 15 Jun 2014 08:59:53 -0700
Subject: Re: What is the best approach to send lots of XML Messages to Solr to build index?
From: Erick Erickson
To: solr-user@lucene.apache.org

A couple of things:

> Consider indexing them with SolrJ; here's a place to get started:
http://searchhub.org/2012/02/14/indexing-with-solrj/. Especially if you use a
SAX-based parser, you have more control over memory consumption; it's on the
client, after all. And you can run as many clients as you need, all sending
to Solr.

> Here's a bunch of information about tlogs and commits that might be useful
background:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/.
Consider setting your hard commit (autoCommit) interval quite short
(15 seconds) with openSearcher set to false. That'll truncate your tlog,
although how that relates to your error is something of a mystery to me...

Best,
Erick

On Sun, Jun 15, 2014 at 3:14 AM, Mikhail Khludnev wrote:
> Hello Floyd,
>
> Did you consider disabling the tlog?
> Does each file contain many docs?
> Do you have SolrCloud? Do you use just sh/curl, or do you have a Java program?
> DIH is not really performant so far. Submitting roughly ten huge files in
> parallel is a way to get good performance. Once again, nuke the tlog.
>
>
> On Sun, Jun 15, 2014 at 12:44 PM, Floyd Wu wrote:
>
>> Hi,
>> I have many XML message files formatted like this:
>> https://wiki.apache.org/solr/UpdateXmlMessages
>>
>> These files are generated by my index builder daily.
>> Currently I am sending these files via HTTP POST to Solr, but sometimes I
>> hit an OOM exception or too many pending tlogs.
>>
>> Do you have a better way to "import" these files into Solr to build the index?
>>
>> Thanks for the suggestion
>>
>> Floyd
>>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
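
To make the SAX suggestion above a bit more concrete, here is a minimal sketch of the client-side half: it streams an <add><doc>...</doc></add> update message with the JDK's SAX parser and hands documents off in small batches, so only one batch is held in memory at a time. The class name BatchingUpdateXmlReader, the stream method, the batch size, and the sink callback are all hypothetical names made up for illustration; the sketch also assumes flat documents (no nested child docs) and ignores attributes such as boost.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: streams an update XML message and emits documents in fixed-size batches. */
public class BatchingUpdateXmlReader {

    /**
     * Parses the stream with SAX and passes batches of at most batchSize
     * documents to sink. Each document is a map of field name to values.
     * Returns the total number of documents seen.
     */
    public static int stream(InputStream in, int batchSize,
                             Consumer<List<Map<String, List<String>>>> sink) {
        int[] total = {0};
        DefaultHandler handler = new DefaultHandler() {
            private List<Map<String, List<String>>> batch = new ArrayList<>();
            private Map<String, List<String>> doc;
            private String fieldName;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("doc".equals(qName)) {
                    doc = new LinkedHashMap<>();
                } else if ("field".equals(qName) && doc != null) {
                    fieldName = atts.getValue("name");
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so accumulate.
                if (fieldName != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("field".equals(qName) && fieldName != null) {
                    // Repeated field names are kept as multiple values.
                    doc.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(text.toString());
                    fieldName = null;
                } else if ("doc".equals(qName)) {
                    batch.add(doc);
                    doc = null;
                    total[0]++;
                    if (batch.size() >= batchSize) flush();
                }
            }

            @Override
            public void endDocument() {
                // Send whatever is left over after the last full batch.
                if (!batch.isEmpty()) flush();
            }

            private void flush() {
                sink.accept(batch);
                batch = new ArrayList<>(); // start a fresh list; the sink owns the old one
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Failed to parse update message", e);
        }
        return total[0];
    }
}
```

In a real indexer the sink is where SolrJ would come in: convert each map into a SolrInputDocument with addField(name, value) and pass the batch to the client's add(...) method. That part is left out here so the sketch runs with only the JDK. Leaving commits to a short autoCommit interval on the server, as suggested above, rather than committing from each client, would fit this design.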