From: "luocanrao"
Reply-To: java-user@lucene.apache.org
Subject: A question about Google search index?
Date: Wed, 9 Jun 2010 22:18:17 +0800

Here is some news about Google's search index (quoted below). Lucene's index system can also support real-time search. Is there any difference between the two?

> With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before, no matter when or where it was published.

> Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.
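
To make the idea of adding new pages "straight to the index" concrete, here is a toy sketch of an inverted index that is updated in place and is searchable immediately after each add. It is only an illustration of incremental indexing in general. It is not how Lucene or Caffeine are implemented, and the names `IncrementalIndex`, `add`, and `search` are hypothetical.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy in-memory inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps each term to the set of document ids that contain it.
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Update the postings in place; no full rebuild of the index is needed.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        # Return the ids of documents containing every query term (AND semantics).
        terms = query.lower().split()
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


index = IncrementalIndex()
index.add("page1", "Lucene supports near real-time search")
# page1 is visible to searches as soon as add() returns.
print(index.search("real-time search"))

index.add("page2", "Caffeine updates the search index continuously")
# The second page is searchable without re-indexing the first.
print(sorted(index.search("search")))
```

A batch-oriented design would instead rebuild the whole index periodically, so new pages would stay invisible until the next rebuild. The sketch ignores everything that makes this hard at scale, such as persistence, concurrency, and distribution across machines.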