Date: Fri, 19 Jul 2013 16:48:50 +0000 (UTC)
From: "Robert Muir (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Created] (LUCENE-5123) invert the codec postings API

Robert Muir created LUCENE-5123:
-----------------------------------

             Summary: invert the codec postings API
                 Key: LUCENE-5123
                 URL: https://issues.apache.org/jira/browse/LUCENE-5123
             Project: Lucene - Core
          Issue Type: Wish
            Reporter: Robert Muir

Currently FieldsConsumer/PostingsConsumer/etc. is a "push"-oriented API: for example, FreqProxTermsWriter streams the postings at flush, and the default merge() takes the incoming codec API, filters out deleted docs, and "pushes" via the same API (though that can be overridden).

It could be cleaner if we allowed for a "pull" model instead (like DocValues). For example, maybe FreqProxTermsWriter could expose a Terms of itself and just pass this to the codec consumer.

This would give the codec more flexibility, e.g. to make multiple passes if it wanted to do things like encode high-frequency terms more efficiently with a bitset-like encoding, among other things.

A codec can try to do things like this to some extent today, but it's very difficult (look at the buffering in Pulsing). We made this change with DV (DocValues), and it made a lot of interesting optimizations easy to implement.
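
To make the multi-pass idea concrete, here is a minimal sketch of what a "pull" codec might look like. It deliberately does not use the real Lucene classes: TermsSource, BufferedPostings, and encode are hypothetical names invented for illustration, terms are plain strings, and the density threshold is arbitrary. The point is only that a codec handed a re-iterable view can count a term's postings first and then choose an encoding, without buffering anything itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.PrimitiveIterator;
import java.util.TreeMap;

public class PullPostingsSketch {

    /** Hypothetical pull-style view of buffered postings; NOT Lucene's Terms API. */
    public interface TermsSource {
        Iterable<String> terms();                  // terms in sorted order
        PrimitiveIterator.OfInt docs(String term); // fresh iterator on every call
    }

    /** Toy in-memory buffer standing in for the indexer's buffered postings. */
    public static final class BufferedPostings implements TermsSource {
        private final TreeMap<String, List<Integer>> postings = new TreeMap<>();

        /** Doc IDs must be added in increasing order per term. */
        public void add(String term, int docId) {
            postings.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
        }

        @Override
        public Iterable<String> terms() {
            return postings.keySet();
        }

        @Override
        public PrimitiveIterator.OfInt docs(String term) {
            return postings.get(term).stream().mapToInt(Integer::intValue).iterator();
        }
    }

    /** Toy codec that pulls each term's postings twice: once to count, once to encode. */
    public static Map<String, String> encode(TermsSource source, int maxDoc) {
        Map<String, String> encoded = new TreeMap<>();
        for (String term : source.terms()) {
            // Pass 1: learn the document frequency before writing anything.
            int docFreq = 0;
            PrimitiveIterator.OfInt it = source.docs(term);
            while (it.hasNext()) {
                it.nextInt();
                docFreq++;
            }

            // Pass 2: pull the same postings again and write the chosen encoding.
            it = source.docs(term);
            if (docFreq * 2 >= maxDoc) { // arbitrary density threshold for this sketch
                char[] bits = new char[maxDoc];
                Arrays.fill(bits, '0');
                while (it.hasNext()) {
                    bits[it.nextInt()] = '1';
                }
                encoded.put(term, "bitset:" + new String(bits));
            } else {
                StringBuilder gaps = new StringBuilder();
                int previous = 0;
                while (it.hasNext()) {
                    int doc = it.nextInt();
                    if (gaps.length() > 0) gaps.append(',');
                    gaps.append(doc - previous); // store the gap to the previous doc ID
                    previous = doc;
                }
                encoded.put(term, "delta:" + gaps);
            }
        }
        return encoded;
    }

    public static void main(String[] args) {
        BufferedPostings buffer = new BufferedPostings();
        buffer.add("the", 0);
        buffer.add("the", 1);
        buffer.add("the", 3);
        buffer.add("lucene", 2);
        System.out.println(encode(buffer, 4));
    }
}
```

With a push-style consumer, the same decision would require the codec to buffer a term's postings until it had seen enough of them to choose, which is roughly the kind of buffering the issue points to in Pulsing.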