From: "Herbert Roitblat"
Reply-To: java-user@lucene.apache.org
Subject: How to get the tokens for a given document
Date: Mon, 12 Apr 2010 11:15:13 -0700

Hi, folks.

I appreciate the help people have been offering.

Here is my problem. My immediate need is to get the tokens for a document from the Lucene index. I have a list of documents that I walk, one at a time. Right now, I am getting the tokens and their frequencies, and the problem is that these stay in the heap as I move from document to document.

Is there another way to get the tokens given a document ID?

Thanks, I'm looking for alternative ways to skin this cat.

Herb
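
For illustration only, the access pattern described above (walk the documents one at a time, collect each document's tokens and frequencies, then move on) might look roughly like the sketch below. This is not Lucene code and not necessarily what the poster is running: the class TokenWalk, both method names, and the use of a plain list of strings in place of an index are all assumptions made for the example. The point it shows is that the per-document frequency map lives only inside one loop iteration, so nothing keeps it on the heap once the next document is reached.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT Lucene API: a List<String> plays the role of the
// index, and list positions play the role of document IDs.
class TokenWalk {

    // Count tokens for ONE document. The map is created per call, so it
    // becomes unreachable as soon as the caller drops its reference.
    static Map<String, Integer> tokenFrequencies(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Walk documents one at a time, keeping only a small summary (the number
    // of distinct tokens) instead of the per-document maps themselves.
    static int[] distinctTokenCounts(List<String> docs) {
        int[] counts = new int[docs.size()];
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, Integer> freqs = tokenFrequencies(docs.get(docId));
            counts[docId] = freqs.size();
            // freqs goes out of scope here; no reference survives the iteration
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the cat sat on the mat", "skin this cat");
        System.out.println(tokenFrequencies(docs.get(0)));
        System.out.println(Arrays.toString(distinctTokenCounts(docs)));
    }
}
```

If per-document data still accumulates with a loop shaped like this, the retention is more likely coming from something held outside the loop (a cache, a collection that is appended to, or a long-lived reader) than from the maps themselves.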