From: "Developer Developer"
To: java-user@lucene.apache.org
Date: Sun, 7 Oct 2007 12:48:50 -0400
Subject: Re: Lucene newbee quesiton- Term Positions
In-Reply-To: <359a92830710070938la012e9fyb5e7eae8e3a08f6f@mail.gmail.com>

Hi Erick,

Thanks for the quick reply.

My index does not return any hits when I search for certain phrases, even
though I am sure that the indexed documents contain those phrases. So I
want to list all the terms and their positions for a given document, just
to confirm that the terms were indexed in the correct order.

I checked with Luke and came up with the following code, which does not
seem to work: positions.next() returns false. Do you see anything wrong
with this code?

    Directory dir = FSDirectory.getDirectory(args[0]);
    IndexReader reader = IndexReader.open(dir);
    TermPositions positions = reader.termPositions();
    while (positions.next()) {
        positions.nextPosition();
        positions.nextPosition();
        byte b[] = positions.getPayload(null, 0);
        System.out.println(b);
    }

On 10/7/07, Erick Erickson wrote:
>
> I suspect that this is more work than you think, not to mention
> very slow. This is just due to the nature of an inverted
> index....
>
> To see what I mean, get a copy of Luke and have it
> reconstruct one of your documents and you'll see what the
> performance is like.
>
> I think Luke has all the example code you could ask for, that's
> the place I'd look first. See:
> http://lucene.apache.org/java/docs/contributions.html
>
> Why do you want to do this and is it really necessary? You
> could think about storing the entire document, then when you
> needed to count terms, just using one of the tokenizers and
> counting them yourself....
>
> Best
> Erick
>
> On 10/7/07, Developer Developer wrote:
> >
> > Hello,
> >
> > I have simple lucene 2.2 index created. I want to list all the terms and
> > their positions in a document. how can I do it ?
> >
> > Can you please provide some sample code.
> >
> > Thanks !
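
For readers following the thread, here is a small sketch of why listing
the terms and positions of one document is a full walk over an inverted
index. It is a toy in-memory stand-in written with the JDK only, not
Lucene's implementation: the class ToyInvertedIndex and its methods
addDocument and termPositionsFor are hypothetical names, and the
whitespace tokenizer is an assumption. The walk in termPositionsFor has
the same shape as enumerating every term and then reading that term's
postings. As far as I know, in the Lucene 2.x API the no-argument
IndexReader.termPositions() returns an unpositioned enumerator that has
to be pointed at a term with seek() before next() can return anything,
which would be consistent with next() returning false in the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory inverted index; an illustration only, not Lucene's implementation. */
final class ToyInvertedIndex {
    // term -> (docId -> positions of that term in the document); terms are kept sorted
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Lowercases the text, splits it on whitespace, and records each token's position. */
    public void addDocument(int docId, String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return; // nothing to index
        }
        String[] tokens = trimmed.split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    /**
     * Visits every term in the index and keeps the positions recorded for docId.
     * The cost grows with the number of distinct terms in the whole index,
     * not with the length of the one document being inspected.
     */
    public Map<String, List<Integer>> termPositionsFor(int docId) {
        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, List<Integer>>> entry : postings.entrySet()) {
            List<Integer> positions = entry.getValue().get(docId);
            if (positions != null) {
                result.put(entry.getKey(), positions);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(0, "the quick brown fox");
        index.addDocument(1, "the lazy dog saw the fox");
        // Print each term of document 1 together with its positions.
        System.out.println(index.termPositionsFor(1));
    }
}
```

If the positions come back in the expected order for a document, a phrase
query that still fails is more likely an analyzer mismatch between index
time and query time than a problem with the stored positions, though that
is a guess and not something the thread confirms.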