From: Will Johnson
Reply-To: java-user@lucene.apache.org
Subject: RE: Why indexing database is necessary? (RE: indexing database)
Date: Tue, 4 Mar 2008 14:18:04 -0500
Message-ID: <000301c87e2c$798b1d00$6ca15700$@com>
In-Reply-To: <214FF1B5E37DC84D9968F0F82FBB1125031B7E32@AUGEXCH.ghsinc.com>

Don't forget the number one reason: speed. For certain types of
queries, a search engine can return results orders of magnitude faster
than a database. I've seen search engines return hits in hundreds of
milliseconds when the equivalent database query took hours or even
days. That's not to say that a search engine is always better, just
that it often is when the inputs and outputs are carefully defined.

- will

-----Original Message-----
From: Darren Hartford [mailto:dhartford@ghsinc.com]
Sent: Tuesday, March 04, 2008 1:52 PM
To: java-user@lucene.apache.org
Subject: RE: Why indexing database is necessary? (RE: indexing database)

Reasons to index with Lucene/Nutch on top of, or instead of, DB
indexing:

1) relevance scoring
2) alias searching (e.g. a large number of aliases, such as first names)
3) highlighting
4) cross-datasource searching (multiple DBs, DB + XML files, etc.)

As for the best approach to indexing externally, I do not have any
direct pointers. I would recommend looking at an ETL tool that can be
extended for this purpose (I started writing a plugin for Pentaho but
got pulled off and haven't finished it, and that was for Solr, not
Lucene/Nutch).

-D

> -----Original Message-----
> From: Duan, Nick [mailto:NDuan@mcdonaldbradley.com]
> Sent: Tuesday, March 04, 2008 1:33 PM
> To: java-user@lucene.apache.org
> Subject: Why indexing database is necessary? (RE: indexing database)
>
> Could anyone provide any insight into why someone would use
> Nutch/Lucene or any other search engine to index relational
> databases? With use cases, if possible? Shouldn't the database's own
> indexing mechanism be used, since it is more efficient?
>
> If there is a need to index the database content using a search
> engine, what would be the best approach other than de-normalizing the
> database?
>
> Thanks a lot in advance!
>
> ND
>
> -----Original Message-----
> From: payo [mailto:payo22@yahoo.com]
> Sent: Tuesday, March 04, 2008 12:36 PM
> To: nutch-user@lucene.apache.org
> Subject: indexing database
>
> Hi all,
>
> Can I index a database with Nutch?
>
> I am using Nutch 0.8.1.
>
> Thanks
> --
> View this message in context:
> http://www.nabble.com/indexing-database-tp15832696p15832696.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org