Message-ID: <5762634.506201279748574433.JavaMail.jira@thor>
Date: Wed, 21 Jul 2010 17:42:54 -0400 (EDT)
From: "Jason Rutherglen (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2346) Explore other in-memory postinglist formats for realtime search

    [ https://issues.apache.org/jira/browse/LUCENE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890915#action_12890915 ]

Jason Rutherglen commented on LUCENE-2346:
------------------------------------------

Are there any additional thoughts on this one?

> Explore other in-memory postinglist formats for realtime search
> ---------------------------------------------------------------
>
>                 Key: LUCENE-2346
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2346
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 4.0
>
>
> The current in-memory posting list format might not be optimal for searching. VInt decoding performance and the lack of skip lists would arguably be the biggest bottlenecks.
> For LUCENE-2312 we should investigate other formats.
> Some ideas:
> - PFOR or packed ints for posting slices?
> - Maybe even int[] slices instead of byte slices? This would be great for search performance, but the additional memory overhead might not be acceptable.
> - For realtime search it's usually desirable to evaluate the most recent documents first. So using backward pointers instead of forward pointers and having the postinglist pointer point to the most recent docID in a list is something to consider.
> - Skipping: if we use fixed-length postings ([packed] ints) we can do binary search within a slice. We can also locate a pointer then without scanning and thus skip entire slices quickly. Is that sufficient or would we need more skipping layers, so that it's possible to skip directly to particular slices?
> It would be awesome to find a format that doesn't slow down "normal" indexing, but is very efficient for in-memory searches. If we can't find such a fits-all format, we should have a separate indexing chain for real-time indexing.
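
To make the "Skipping" idea above concrete, here is a minimal sketch of binary search over fixed-length postings held in int[] slices. It is not Lucene code and not the format proposed in the issue: the class IntSlicePostings, its methods add() and advance(), the NO_MORE_DOCS sentinel and the slice size are all hypothetical names chosen for illustration. It also assumes that slices are reachable through a directory (here a List<int[]>), which is an extra structure the issue does not describe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: a posting list stored as fixed-size int[] slices of ascending docIDs. */
final class IntSlicePostings {
    /** Sentinel returned when no docID >= target exists (an assumption of this sketch). */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int sliceSize;
    // Assumed slice directory: gives random access to slices so whole slices can be skipped.
    private final List<int[]> slices = new ArrayList<>();
    private int count = 0; // total number of docIDs appended so far

    IntSlicePostings(int sliceSize) {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        this.sliceSize = sliceSize;
    }

    /** Appends a docID; docIDs must arrive in strictly increasing order. */
    void add(int docId) {
        if (count > 0 && docId <= lastDocId()) {
            throw new IllegalArgumentException("docIDs must be strictly increasing");
        }
        if (count % sliceSize == 0) {
            slices.add(new int[sliceSize]); // start a new slice when the current one is full
        }
        slices.get(count / sliceSize)[count % sliceSize] = docId;
        count++;
    }

    private int lastDocId() {
        return slices.get((count - 1) / sliceSize)[(count - 1) % sliceSize];
    }

    /** Number of valid entries in slice s (only the last slice can be partially filled). */
    private int used(int s) {
        return Math.min(sliceSize, count - s * sliceSize);
    }

    /** Returns the smallest docID >= target, or NO_MORE_DOCS if there is none. */
    int advance(int target) {
        // Step 1: binary search over slices by their last docID, skipping whole slices unscanned.
        int lo = 0;
        int hi = slices.size() - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (slices.get(mid)[used(mid) - 1] >= target) {
                found = mid;   // candidate; keep looking for an earlier slice
                hi = mid - 1;
            } else {
                lo = mid + 1;  // every docID in this slice is below target
            }
        }
        if (found < 0) {
            return NO_MORE_DOCS;
        }
        // Step 2: binary search within the chosen slice (possible because entries are fixed-length).
        int[] slice = slices.get(found);
        int idx = Arrays.binarySearch(slice, 0, used(found), target);
        if (idx < 0) {
            idx = -idx - 1; // convert "not found" result to the insertion point
        }
        return slice[idx];
    }
}
```

For example, with a slice size of 4 and the docIDs 2, 5, 9, 14, 20, 21, 30, 41, 57, a call to advance(15) skips the first slice entirely and returns 20. The open question from the issue remains: without a slice directory like the one assumed here, locating the right slice would need forward or backward pointers and possibly additional skipping layers.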