Message-Id: <99FAFE58-FDAD-4280-89CC-A1AC150B9A32@apache.org>
From: Grant Ingersoll
To: java-dev@lucene.apache.org
In-Reply-To: <2C2DB0F3-BAF0-49C2-8595-EC38559EF0C4@ix.netcom.com>
Subject: Re: postings without position information ?
Date: Thu, 7 Feb 2008 15:46:47 -0500

Search the archive for "flexible indexing". There have been a number of
discussions along these lines. I don't know that your specific issue was
ever covered, but it seems to fit that model. I think there was even a
patch at one point.

-Grant

On Feb 7, 2008, at 1:43 PM, robert engels wrote:

> I think there are many uses of Lucene that would benefit from 'enum'
> fields, aka categories.
>
> When classifying documents, they are often in one or more categories.
>
> Lucene could write these postings very efficiently using VINT and RLE
> (run length encoding) if the position information was not stored
> (since it is not really useful in these typical cases).
>
> StartingDocNum|NumberOfDocuments...StartingDocNum|NumberOfDocuments
> using a bit of the StartingDocNum to know if it was a series.
>
> When a lot of documents are in the same category, and they are added
> at the same time, the document numbers would be nearly sequential,
> allowing very efficient compression.
>
> Has anyone worked on this? Our previous custom IndexReaderWriter
> supported it, and I was wondering if this has made it into the core.
> I checked the docs/email and could not find anything.
>
> Thanks.
>
> Robert
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
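
For readers of the archive, the StartingDocNum|NumberOfDocuments layout
from the quoted message can be sketched roughly as below. This is only
an illustration in Python, not Lucene code and not the poster's
IndexReaderWriter. The function names are hypothetical, and two details
are assumptions the thread does not spell out: each StartingDocNum is
stored as a delta from the last document of the previous run, and the
"series" flag is the low bit of that value. The VInt format used here is
the usual 7-data-bits-per-byte scheme.

```python
def write_vint(out, value):
    """Append a non-negative int as a VInt: 7 data bits per byte,
    high bit set when more bytes follow (low-order bits first)."""
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_vint(data, pos):
    """Read one VInt starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7


def encode_postings(doc_ids):
    """Encode strictly increasing, non-negative doc ids as runs.

    Assumed layout per run (hypothetical, for illustration only):
      VInt((delta << 1) | is_series) [VInt(count) if is_series]
    where delta is measured from the last doc of the previous run.
    """
    out = bytearray()
    prev = 0
    i, n = 0, len(doc_ids)
    while i < n:
        # Extend j to the end of the run of consecutive doc ids.
        j = i
        while j + 1 < n and doc_ids[j + 1] == doc_ids[j] + 1:
            j += 1
        count = j - i + 1
        delta = doc_ids[i] - prev
        if count > 1:
            write_vint(out, (delta << 1) | 1)  # low bit 1: a series follows
            write_vint(out, count)
        else:
            write_vint(out, delta << 1)        # low bit 0: single document
        prev = doc_ids[j]
        i = j + 1
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings: expand runs back into doc ids."""
    docs, pos, prev = [], 0, 0
    while pos < len(data):
        head, pos = read_vint(data, pos)
        start = prev + (head >> 1)
        count = 1
        if head & 1:
            count, pos = read_vint(data, pos)
        docs.extend(range(start, start + count))
        prev = start + count - 1
    return docs


if __name__ == "__main__":
    # A category whose documents were mostly added together.
    docs = list(range(1000, 2000)) + [2500, 2501, 2502, 9000]
    encoded = encode_postings(docs)
    print(len(docs), "doc ids ->", len(encoded), "bytes")
    assert decode_postings(encoded) == docs
```

With nearly sequential document numbers, as in the example at the
bottom, each long run costs only a few bytes regardless of its length,
which is the compression effect the quoted message is after. Scattered
document numbers gain nothing from the run flag and pay one extra bit
per entry.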