From: John
To: java-user@lucene.apache.org
Subject: Field name size and index size
Date: Sat, 22 Mar 2008 09:11:13 -0400

Hi,

Let's say my data source consists of records like so (the example is Field=Value):

- AAAAAAAAAA=Value1
- BBBBBBBBBB=Value2
- CCCCCCCCCC=Value3
- DDDDDDDDDD=Value4

And let's say I have a second copy of my data, but this time it looks like so:

- A=Value1
- B=Value2
- C=Value3
- D=Value4

I.e., same data; the only change is that the field names are now shorter.

Now if I create two Lucene indexes, one using the long field names and one using the short field names (my data has not changed), will the Lucene index be smaller for the one with short field names? Will updating and optimizing the index be faster? Will searching be faster?

That is, am I better off using short field names vs. long field names?

Yes, I will do some performance analysis, but I want to know if this matters before I do so.

Thanks in advance!

-JM