From: "Uwe Schindler"
Subject: RE: Distinct terms values? (like in Luke)
Date: Mon, 26 Oct 2009 10:42:42 +0100
In-Reply-To: <26056543.post@talk.nabble.com>

> @Test
> public void distinct() throws Exception {
>     RAMDirectory directory = new RAMDirectory();
>     IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(),
>         true, IndexWriter.MaxFieldLength.UNLIMITED);
>
>     for (int l = -2; l <= 2; l++) {
>         Document doc = new Document();
>         doc.add(new Field("text", "the big brown", Field.Store.NO,
>             Field.Index.ANALYZED));
>         doc.add(new NumericField("trie", Field.Store.NO, true).setIntValue(l));
>         writer.addDocument(doc);
>     }
>
>     writer.close();
>
>     IndexReader reader = IndexReader.open(directory, true);
>     TermEnum termEnum = reader.terms(new Term("trie", ""));
>     Term next = termEnum.term();
>     List ints = new ArrayList();
>
>     while (next != null && next.field().equals("trie")) {
>         ints.add(NumericUtils.prefixCodedToInt(next.text()));
>         next = termEnum.next() ? termEnum.term() : null;
>     }
>
>     reader.close();
>
>     log.info(ints.toString());
> }
>
> ==> [-2, -1, 0, 1, 2, -16, 0, -256, 0, -4096, 0, -65536, 0, -1048576, 0,
> -16777216, 0, -268435456, 0]

The extra values are the lower-precision terms that NumericField indexes in
addition to the full-precision ones. You can add a check to your while
statement that stops the iteration as soon as the next lower precision
appears:

    while (next != null && next.field().equals("trie")
           && next.text().charAt(0) == NumericUtils.SHIFT_START_INT) ...

Use the same constant for float, and SHIFT_START_LONG for long and double.

This should work. Maybe we should add a method to NumericUtils that checks
this and returns true/false depending on whether the term is of highest
precision.
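
The effect of that check can be illustrated without Lucene on the classpath.
What follows is a minimal sketch, not Lucene code: encode() and decode() are
hypothetical stand-ins modeled on the prefix coding in NumericUtils, the value
0x60 for SHIFT_START_INT is an assumption about the Lucene 2.9 constant, and
the precision step of 4 is chosen because it is consistent with the output
quoted above. A TreeSet stands in for the sorted term dictionary that TermEnum
walks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class FullPrecisionTerms {
    // Assumption: mirrors NumericUtils.SHIFT_START_INT in Lucene 2.9.
    static final char SHIFT_START_INT = (char) 0x60;

    // Hypothetical stand-in for prefix coding: the first char records the
    // shift, the remaining chars hold the shifted sortable bits, 7 per char.
    static String encode(int value, int shift) {
        int sortableBits = (value ^ 0x80000000) >>> shift;
        int nChars = (31 - shift) / 7 + 1;
        char[] buf = new char[nChars + 1];
        buf[0] = (char) (SHIFT_START_INT + shift);
        for (int i = nChars; i >= 1; i--) {
            buf[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return new String(buf);
    }

    // Inverse of encode(); the bits dropped by the shift come back as zeros.
    static int decode(String term) {
        int shift = term.charAt(0) - SHIFT_START_INT;
        int sortableBits = 0;
        for (int i = 1; i < term.length(); i++) {
            sortableBits = (sortableBits << 7) | term.charAt(i);
        }
        return (sortableBits << shift) ^ 0x80000000;
    }

    // Walks the sorted terms for the values -2..2. With fullPrecisionOnly set,
    // the walk stops at the first term whose leading char signals a lower
    // precision, which is the check proposed in the reply.
    static List<Integer> collect(boolean fullPrecisionOnly) {
        TreeSet<String> terms = new TreeSet<String>();
        for (int value = -2; value <= 2; value++) {
            for (int shift = 0; shift < 32; shift += 4) {
                terms.add(encode(value, shift));
            }
        }
        List<Integer> ints = new ArrayList<Integer>();
        for (String term : terms) {
            if (fullPrecisionOnly && term.charAt(0) != SHIFT_START_INT) {
                break; // sorted order puts all lower-precision terms after this point
            }
            ints.add(decode(term));
        }
        return ints;
    }

    public static void main(String[] args) {
        System.out.println(collect(false)); // includes the lower-precision terms
        System.out.println(collect(true));  // [-2, -1, 0, 1, 2]
    }
}
```

Breaking out of the loop, rather than skipping individual terms, relies on the
shift being encoded in the first character, so all full-precision terms of a
field sort ahead of the lower-precision ones.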
Uwe