From: "Zeynep P."
To: java-user@lucene.apache.org
Date: Mon, 7 May 2012 07:52:36 -0700 (PDT)
Subject: Re: pruning package- pruneAllPositions

Thanks for the link. I reviewed it.
Here are more details about the exception.

I used contrib/benchmark/conf/wikipedia.alg to index a Wikipedia dump with MAddDocs: 200000. I wanted to index only a specific period of time, so I added an if statement to doLogic in the AddDocTask class.

I then tried to prune the index with the pruning package (CarmelTopKPruning) and got the exception. To trace it, I added System.out.println(term); as the first line of initPositionsTerm and System.out.println("***" + term); as its last line. The exception comes from pruneAllPositions (throw new IOException("termPositions.doc > docs[docsPos].doc");).

For example, for the token body:freely I got the following output:

body:freely
***body:freely
body:freely
***body:freely
body:freely
***body:freely
Carmel topk in exception (docs[docsPos].doc = 4414, termPositions.doc() = 4995)
Carmel topk in exception (docs[docsPos].doc = 4414, termPositions.doc() = 4996)
Carmel topk in exception (docs[docsPos].doc = 4414, termPositions.doc() = 4997)
..
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
Carmel topk in exception
body:freely
***body:freely
Carmel topk in exception
Carmel topk in exception
body:freely
***body:freely
body:freely
***body:freely

I hope my problem is clearer now.

Thanks in advance,
Best regards,
ZP
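
To make the failing condition concrete, here is a minimal stand-alone sketch of the check named in the exception message. It is not the pruning package's code. It assumes the policy walks an ascending array of kept doc IDs (standing in for docs[docsPos].doc) in lockstep with the term's postings (standing in for termPositions.doc()); that assumption is inferred only from the exception text and the values printed above. The class LockstepSketch and the method outOfSyncDocs are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; none of this is Lucene API.
class LockstepSketch {

    /**
     * keptDocs:    ascending doc IDs selected for a term (plays the role of docs[]).
     * postingDocs: ascending doc IDs actually seen while iterating the term's
     *              postings (plays the role of termPositions.doc()).
     * Returns every posting doc that is greater than the kept doc currently
     * pointed at, i.e. every doc for which the reported condition holds.
     */
    static List<Integer> outOfSyncDocs(int[] keptDocs, int[] postingDocs) {
        List<Integer> bad = new ArrayList<>();
        int docsPos = 0; // index into keptDocs, advanced only on an exact match
        for (int doc : postingDocs) {
            if (docsPos >= keptDocs.length) {
                break; // no kept docs left to compare against
            }
            int kept = keptDocs[docsPos];
            if (doc > kept) {
                // The kept doc was never seen in the postings, so the pointer
                // is stuck behind the iterator: the condition from the exception.
                bad.add(doc);
            } else if (doc == kept) {
                docsPos++; // match: move on to the next kept doc
            }
            // doc < kept: this posting is simply not in the kept set
        }
        return bad;
    }

    public static void main(String[] args) {
        // Consistent input: every kept doc occurs in the postings.
        System.out.println(outOfSyncDocs(new int[] {2, 5}, new int[] {1, 2, 3, 5, 7}));
        // Inconsistent input modeled on the values above: kept doc 4414 never
        // appears in the postings, so later docs all trip the check.
        System.out.println(outOfSyncDocs(new int[] {4414}, new int[] {4400, 4995, 4996, 4997}));
    }
}
```

Under these assumptions, a pointer that stays at 4414 while the postings move through 4995, 4996 and 4997 would mean the two sequences were not built from the same set of documents for that term. Whether that is what happens in the real index cannot be confirmed from the output alone.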