From: Yonik Seeley
To: java-dev@lucene.apache.org
Date: Mon, 4 Apr 2005 17:35:04 -0400
Subject: scalability w/ number of fields

I know Lucene is very scalable in many ways, but how about the number of fieldnames?

We have an index using around 6000 unique fieldnames, 450,000 documents, and a total index size of 4GB. It's very sparse... documents don't have that many fields, but the number of different fieldtypes is huge.

An optimize of this index took about an hour (mergeFactor 10, compound index). This is on enterprise hardware (fast SCSI RAID, 6GB RAM, dual 2.8GHz Xeon). The JVM was Java 5 with a 2.5GB heap.

This seems very long... does anyone have any insights? We'll be running more tests to see if decreasing the number of fields has an impact.

-Yonik