From: "Jason Rutherglen (JIRA)"
To: java-dev@lucene.apache.org
Date: Mon, 5 May 2008 06:03:55 -0700 (PDT)
Subject: [jira] Commented: (LUCENE-1278) Add optional storing of document numbers in term dictionary
Message-ID: <885875387.1209992635674.JavaMail.jira@brutus>
In-Reply-To: <20844020.1209729535629.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594231#action_12594231 ]

Jason Rutherglen commented on LUCENE-1278:
------------------------------------------

Storing the docs is off by default, so the index grows only if the user opts in. Because the docs are written as a byte blob, they can be skipped rather than read when loaddocs is false. Field cache and range query loading are very slow because each term costs two seeks (one for the TermEnum, then one for the TermDocs). If the docs were kept in a separate file, the terms would have to be stored redundantly. A field cache example:

  protected Object createValue(IndexReader reader, Object entryKey) throws IOException {
    Entry entry = (Entry) entryKey;
    String field = entry.field;
    IntParser parser = (IntParser) entry.custom;
    final int[] retArray = new int[reader.maxDoc()];
    // Previous approach: a separate TermDocs, costing a second seek per term.
    // TermDocs termDocs = reader.termDocs();
    // TermEnum termEnum = reader.terms(new Term(field, ""));
    // Patched call: the second argument asks for the docs to be loaded with each term.
    TermEnum termEnum = reader.terms(new Term(field, ""), true);
    try {
      do {
        Term term = termEnum.term();
        // Field names are interned, so the reference comparison is intentional.
        if (term == null || term.field() != field) break;
        int termval = parser.parseInt(term.text());
        // The docs come straight from the term dictionary entry.
        int[] docs = termEnum.docs();
        for (int x = 0; x < docs.length; x++) {
          retArray[docs[x]] = termval;
        }
        // termDocs.seek(termEnum);
        // while (termDocs.next()) {
        //   retArray[termDocs.doc()] = termval;
        // }
      } while (termEnum.next());
    } finally {
      // termDocs.close();
      termEnum.close();
    }
    return retArray;
  }

> Add optional storing of document numbers in term dictionary
> -----------------------------------------------------------
>
>                 Key: LUCENE-1278
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1278
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: lucene.1278.5.4.2008.patch, lucene.1278.5.5.2008.2.patch, lucene.1278.5.5.2008.patch
>
>
> Add optional storing of document numbers in term dictionary. String index field cache and range filter creation will be faster.
> Example read code:
> {noformat}
> TermEnum termEnum = indexReader.terms(TermEnum.LOAD_DOCS);
> do {
>   Term term = termEnum.term();
>   if (term == null || term.field() != field) break;
>   int[] docs = termEnum.docs();
> } while (termEnum.next());
> {noformat}
> Example write code:
> {noformat}
> Document document = new Document();
> document.add(new Field("tag", "dog", Field.Store.YES, Field.Index.UN_TOKENIZED, Field.Term.STORE_DOCS));
> indexWriter.addDocument(document);
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
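
To illustrate the single-pass loading described in the comment, here is a rough, Lucene-free sketch. It assumes the term dictionary can be modelled as a sorted map from term text to an inline int[] of document numbers. The class InlineDocsSketch and the method fillCache are hypothetical names made up for this illustration; they are not part of Lucene or of the attached patches.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a term dictionary whose entries carry their
// document numbers inline. This is an illustration, not Lucene's API.
final class InlineDocsSketch {

    // Builds an int field cache: cache[doc] holds the parsed value of the
    // term that the document contains (0 if the document has no term).
    static int[] fillCache(Map<String, int[]> termToDocs, int maxDoc) {
        int[] cache = new int[maxDoc];
        // One sequential pass over the dictionary. The docs arrive with each
        // term, so no second lookup per term is needed.
        for (Map.Entry<String, int[]> entry : termToDocs.entrySet()) {
            int value = Integer.parseInt(entry.getKey());
            for (int doc : entry.getValue()) {
                cache[doc] = value;
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> dict = new TreeMap<>();
        dict.put("10", new int[] {0, 3});
        dict.put("25", new int[] {1});
        dict.put("7", new int[] {2, 4});
        int[] cache = fillCache(dict, 5);
        System.out.println(Arrays.toString(cache)); // prints [10, 25, 7, 10, 7]
    }
}
```

The sketch only shows the shape of the loop. The on-disk byte blob, and the ability to skip it when docs are not requested, are not modelled here.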