Date: Fri, 12 Dec 2008 14:15:57 -0800 (PST)
From: Kay Kay
Reply-To: kaykay.unique@yahoo.com
Subject: Re: Solr - DataImportHandler - Large Dataset results ?
To: solr-user@lucene.apache.org

I am using MySQL. I believe MySQL (since version 5) supports streaming.

More on streaming: can we assume that when the database driver supports streaming, the result set iterator is a forward-only iterator?

If, say, the streaming size is 10K records and we are trying to retrieve a total of 100K records, what exactly happens when the threshold is reached (say, once the first 10K records have been retrieved)? Are the previous records thrown away and replaced in memory by the new batch of records?

--- On Fri, 12/12/08, Shalin Shekhar Mangar wrote:

From: Shalin Shekhar Mangar
Subject: Re: Solr - DataImportHandler - Large Dataset results ?
To: solr-user@lucene.apache.org
Date: Friday, December 12, 2008, 9:41 PM

DataImportHandler is designed to stream rows one by one to create Solr documents.
As long as your database driver supports streaming, you should be fine. Which database are you using?

On Sat, Dec 13, 2008 at 2:20 AM, Kay Kay wrote:

> As per the example in the wiki -
> http://wiki.apache.org/solr/DataImportHandler - I am seeing the following
> fragment.
>
> url="jdbc:hsqldb:/temp/example/ex" user="sa" />
>
> ......................
>
> My scaled-down application looks very similar, except that my result set
> is so big that it cannot possibly fit in main memory.
>
> So I was planning to split this single query into multiple subqueries,
> each with an extra condition on the id (id >= 0 and id < 100, say).
>
> I am curious whether there is any way to specify another conditional clause
> (, where the column is supposed to be an integer value), so that the
> implementation could generate the subqueries internally:
>
> i) Get the min and max of the numeric column, and send queries to the
> database based on the batch size.
>
> ii) Add documents for each batch and close the result set.
>
> This might end up putting more load on the database (but at least the
> dataset would fit in main memory).
>
> Let me know if anyone else has run into similar issues and how they were
> addressed.

--
Regards,
Shalin Shekhar Mangar.
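
For illustration, the range-splitting idea from steps i) and ii) of the quoted message can be sketched in plain Java. This is only a sketch under stated assumptions, not something DataImportHandler provides: the class name IdRangeBatcher, its methods, and the table and column names (item, id) are all hypothetical, and the min and max values are assumed to come from a separate query such as SELECT MIN(id), MAX(id).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or DataImportHandler.
public class IdRangeBatcher {

    // Splits the inclusive id range [min, max] into consecutive inclusive
    // sub-ranges that each span at most batchSize ids.
    // Assumes max - min does not overflow a long.
    public static List<long[]> split(long min, long max, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        if (min > max) {
            return ranges; // empty table: nothing to query
        }
        long lo = min;
        while (true) {
            // The last range is clipped to max; the others span batchSize ids.
            long hi = (max - lo < batchSize) ? max : lo + batchSize - 1;
            ranges.add(new long[] {lo, hi});
            if (hi == max) {
                break;
            }
            lo = hi + 1;
        }
        return ranges;
    }

    // Builds the subquery for one range. The table and column names are
    // supplied by the caller and are assumed to be trusted identifiers.
    public static String toQuery(String table, String idColumn, long[] range) {
        return "SELECT * FROM " + table
                + " WHERE " + idColumn + " >= " + range[0]
                + " AND " + idColumn + " <= " + range[1];
    }

    public static void main(String[] args) {
        // Assumed values: ids run from 1 to 100000, batches of 10000 ids.
        for (long[] range : split(1, 100_000, 10_000)) {
            System.out.println(toQuery("item", "id", range));
        }
    }
}
```

Each subquery covers at most batchSize ids, so a batch never returns more than batchSize rows, although gaps in the id sequence can make some batches smaller. As the quoted message notes, this trades memory for extra queries against the database.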