From: "Erick Erickson"
To: java-user@lucene.apache.org
Date: Fri, 15 Jun 2007 11:11:09 -0400
Subject: Re: FW: Lucene indexing vs RDBMS insertion.

From my perspective, this is an irrelevant question. The real question is
"is Lucene indexing fast enough for my application?" Nobody can answer that
for you; you have to experiment. If you're building an index that's only
updated every 6 months, Lucene is certainly "fast enough". If you're
recreating the index every 6 seconds, it's a different question.

So, I recommend that you create a test application that does nothing except
read your source and do whatever parsing you need, and does NOT index
anything at all. Record the time it takes. Then try the same thing WITH
indexing and record the difference. Then, to get a sense of the dimension of
the problem, try inserting into the RDBMS instead of the Lucene index.

Once you have numbers, you can make better decisions, and people can give
you better advice, especially if you include more detail about your design.
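The measurement described above might be sketched roughly as follows. This is
only an illustration, not a real benchmark: the class name, `parse`, `timeRun`,
the generated records, and the in-memory list used as a sink are all
hypothetical placeholders. In an actual test the sink would be replaced first
by a call that adds a document to a Lucene IndexWriter and then by a JDBC
insert, with the same input used each time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class IndexingBenchmark {

    // Hypothetical stand-in for whatever parsing the application really does.
    static String parse(String raw) {
        return raw.trim().toLowerCase();
    }

    // Reads every record, parses it, hands it to the sink,
    // and returns the elapsed time in nanoseconds.
    static long timeRun(List<String> records, Consumer<String> sink) {
        long start = System.nanoTime();
        for (String raw : records) {
            sink.accept(parse(raw));
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Assumption: 1000 synthetic records, mirroring the 1000-record test
        // mentioned in the original question.
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add("  Record number " + i + "  ");
        }

        // Step 1: read and parse only; the parsed result is discarded.
        long baseline = timeRun(records, doc -> { });

        // Step 2: same loop with a storage step. The list is a placeholder;
        // swap in a Lucene IndexWriter call or a JDBC insert here.
        List<String> store = new ArrayList<>();
        long withSink = timeRun(records, store::add);

        System.out.printf("parse only:      %d us%n", baseline / 1_000);
        System.out.printf("parse + storage: %d us%n", withSink / 1_000);
        System.out.printf("difference:      %d us%n", (withSink - baseline) / 1_000);
    }
}
```

A single pass like this is noisy (JIT warm-up, disk caches), so repeating each
run several times and comparing the later runs is likely to give more
trustworthy numbers.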
Best
Erick

On 6/15/07, Chew Yee Chuang wrote:
>
> Hi, I'm a new user to Lucene, and heard that it is a powerful tool for full
> text search and I'm planning to use it in my project for data storage
> purpose. Before the implementation, I could like to know whether there is
> performance issue on Lucene indexing process. I have no doubt on the
> retrieving and searching feature in Lucene but the indexing process. I have
> tested my current system to insert 1000 records in RDBMS storage it took
> about 1 seconds. Thus, if I change my solution to Lucene, can Lucene
> indexing process perform faster than RDBMS ? I have go through some of the
> article talking about the "MergeFactor" and "MaxMergeDocs" parameter for
> fine tune the indexing process, but no comparison between Lucene indexing
> process and RDBMS insertion. Thus, hope someone who have experience in
> Lucene can provide this information or some article that discuss between
> Lucene and RDBMS.
>
> I really appreciate any help in this. Thanks